


The Reliability, Sensitivity, and Criterion-Related Validity of Concept Comparisons
and Concept Maps for Assessing Reading Comprehension

Richard Parker 
Gerald Tindal 
University of Oregon 



Running Head: CONCEPT MAPS 






Concept Maps 
Page 2 



Abstract 



Most current reading assessment methods do not reflect the reading comprehension construct which has
emerged from information processing research. Current methods rarely account for differences in relevant
background knowledge or schema held by students prior to reading, and are insensitive to the structural
nature of text information and student knowledge. This study investigated the reliability, sensitivity, and
criterion-related validity of concept comparison (CC) ratings and computer-derived multidimensional scaling
(MDS) maps for reading comprehension assessment. Reliability was assayed by comparing CC ratings and
maps produced independently by five teachers while they read eight 250-word passages from science and
social studies texts. For three of the eight passages, sufficient interrater reliability was obtained. For the
three reliable passages only, two methods were applied to assay instrument validity with 104 reading
disabled Junior and Senior High School students. First, a randomized control group design was used to
compare CC tasks completed before and after students had read related or unrelated text passages.
Students reading the related passages produced post-reading CC scores significantly more closely related
to expert teacher scores than did readers of unrelated passages. Second, student and expert CC score
similarities were correlated with student scores on two classes of external measures: (a) extant vocabulary
and reading comprehension scores from published, norm-referenced reading tests, and (b) maze tests,
multiple choice questions, and oral reading fluency performance, all based on the reading passages. The
three passage-based measures were substantially related to Post-reading CC scores, but not to
Pre-reading CC scores. Standardized test scores were not significantly related to either Pre- or Post-reading
CC scores. The reliability and validity results were interpreted as supporting further validation research on
the use of concept comparison tasks and derived MDS maps for assessing reading comprehension with
older disabled readers.







The Reliability, Sensitivity, and Criterion-Related Validity of Concept Comparisons
and Concept Maps for Assessing Reading Comprehension
Over the past decade, cognitive processing research has played a major role in explicating the
construct of reading comprehension (Trabasso, 1978; Freedle, 1979). Central to the current view of
reading comprehension is the notion that a reader integrates information s/he reads from text into a
pre-existing, organized network of concepts and information, or "schema" (Anderson, 1977; Spiro, 1977;
Rumelhart & Ortony, 1977). However, most current reading assessment methods do not reflect the reading
comprehension construct which has emerged from information processing research (Kirsch and Guthrie,
1980; Curtis & Glaser, 1983; Johnston, 1984). First, current methods rarely account for differences in
relevant background knowledge or schema held by students prior to reading. Furthermore, the information
processing demands placed on examinees by test tasks or items often are not closely related to the
information demands placed on them by the text (Surber, 1984). There is a growing professional view that
the lack of a sound psychological basis for reading comprehension tests has resulted in inappropriate types
of test items being presented, and inappropriate types of responses being demanded of the student (Linn,
1982; Glaser, 1981; Messick, 1980; Johnston, 1984).

Because of the inadequate relationship between "knowledge structure of the examinee and that of the
test", most standardized reading comprehension tests have been characterized as being "atheoretical"
(Schwartz, 1984) or lacking construct validity (Kirsch & Guthrie, 1980, p. 81). This form of validity is rarely
addressed by test producers (Johnston, 1984), yet there is a growing recognition of the primacy of construct
validity over the traditional categories of criterion-related and content validity (Messick, 1981; AERA, APA,
NCME, 1985). Guion (1978) states that the category of "content validity" should be dropped in favor of a set
of content-oriented rules for test development. In the same vein, Anastasi (1986) concluded that "all
validation procedures contribute to construct validation and can be subsumed under it" (p. 12). This
encompassing notion of construct validity has encouraged theorists with a cognitive processing point of view
to suggest fundamental improvements in reading comprehension tests. Whereas test developers
traditionally have been concerned with adequately sampling behavioral (Aiken, 1979; Anastasi, 1976) or
content (Brown, 1976; Thorndike & Hagen, 1977) domains, that focus is shifting. Partly in response to past
difficulties in defining these behavioral and content domains, the focus has shifted to defining what cognitive
processes and structures are represented both in test content and requisite student responses (Kirsch &
Guthrie, 1980; Guion, 1978).

A second shift in psychometric thinking about validity is related to the uses of test scores. Test use in
schools has social consequences for teachers and students. Terms such as "decision validity",
"discriminant validity", and "treatment validity" are used increasingly to refer to test score interpretation and
test score use in decision making (Hambleton, 1980; Messick, 1981, 1989). Valid test scores are socially
valued and their use is consistent with schools' broader mission and goals (Messick, 1988). Johnston
(1984) notes that the concept of test validity has been moved back to its instructional context, and suggests
that future validation studies include instructional interventions.

From the cognitive perspective, reading comprehension tests must reflect both organization of prior
knowledge (pre-reading schema), and selection and organization of key concepts from text (Johnston,
1984). Researchers have therefore sought a standard symbolic notation for displaying the content and
structure of both the text and the reader's recall of text: "where the content and structure of both...can be
specified, the two structures can be compared" (Meyer & Rice, 1984, p. 320). Kirsch and Guthrie (1980)
also seek a method for matching "the knowledge structure of the examinee and that of the test" (p. 81).
Reading comprehension could then be described in terms of structural and content differences between text
and reader's cognition.

Several formal psycholinguistic models exist for information structure in text and/or knowledge
(cognitive) structure in the learner. Most imply that information is stored as abstract, non-spatial,
non-analogical semantic or propositional networks, with rule systems which can be made explicit (Anderson &
Bower, 1973; Rumelhart, Lindsay & Norman, 1972). These models (as described by Meyer & Rice, 1984)
are not well suited for assessment, because they: (a) are laborious to apply, and require a high level of
expertise, (b) have untested reliability, (c) are often tied to text conventions, and cannot measure
pre-reading knowledge schema, and (d) focus on detailed micro-level analyses of very short passages.

Alternative models are offered by cognitive psychologists who contend that "...humans use frameworks
similar to geometric spaces for organizing or perceiving many types of objects or concepts" (Fenker, 1975,
p. 39). Johnson-Laird (1983) argues that propositional models are too narrow to reflect the cognitive maps
or "Mental Models" which we use to capture key perceptual and logical relationships. Similarly, van Dijk and
Kintsch (1983) introduced "Situation models" to respond to deficiencies noted in narrower propositional
models (Brewer, 1987).

The assessment method investigated in the present study is consistent with Mental
Models in that relationships among concepts are depicted as spatial arrangements, and may be interpreted
in either concrete/perceptual or abstract/propositional terms. The concept maps produced in this study are
also similar to those vocabulary- and text-maps used by teachers to help teach content vocabulary and
explain key concepts in text (Niles, 1965; Hanf, 1971; Heimlich & Pittelman, 1986). These devices include
two-dimensional "webs" of key concepts, characters, or events (with connecting lines sometimes labeled)
and hierarchical branching trees, with a broad topic as the trunk, and details or subordinate concepts at the
ends of branches (Calfee & Drum, 1986; Reutzel, 1986; Holley & Dansereau, 1984). In a variety of forms,
these semantic maps have demonstrated usefulness as learning tools (Guri-Rozenblit, 1989; Vaughan,
1984; Reutzel, 1986; Voss, Greene, Post, & Penner, 1983). However, assessment in this area has
unfortunately lagged behind instruction. Researchers lack proven, replicable methods for (a) producing
maps and hierarchical diagrams from text, and (b) using these same structurally sensitive methods to
measure student learning (Surber, 1984).

The purpose of this study was to use structurally sensitive methods to assess reading comprehension,
including measurement of pre-reading schema, text structure, and post-reading semantic knowledge. Our
goal was to use a spatial measurement method, following Johnson-Laird's (1983) hypothesis that physical
space may serve as an analogue for one or more dimensions of perceptual/conceptual reality. Two
measurement methodologies (one primary and one supplemental) can potentially address this need;
however, to date their application to reading comprehension has been very limited.
Multidimensional Scaling and Hierarchical Cluster Analysis

Both multidimensional scaling (MDS) and hierarchical cluster analysis (HCA) (Preece, 1976;
Shavelson, 1974) yield graphic displays of key topics or concepts, where spatial proximity or linkages
depict similarity or closeness of relationship. MDS yields a map of concepts represented as points in two
(or more) dimensional space, while HCA yields a branching tree, with concepts at the ends of the branches
connected to a common trunk. MDS begins with judgments on the closeness of relationship of pairs of
important concepts or key vocabulary words. The MDS map distances may then be analyzed through HCA
to produce a cluster tree.

MDS maps may serve three purposes: (a) improved comprehension and communication of complex
relationships among concepts, (b) verification of hypothesized concept patterns (comparing obtained
configurations with external criteria), and (c) interpretation of map dimensions (Davison, 1983). Only the
first two applications are relevant to this study. Concept maps can help provide and communicate meaning
through the identification and labeling of (a) concept clusters, (b) relationships among concepts and concept
clusters, and (c) hierarchical (subordinate) relationships among concepts and concept clusters. Interpretation of
concept clusters and inter-concept relationships is demonstrated on MDS maps (Figure 1) from a concept
comparison rating task (Figure 2), completed after reading a 250-word science passage, "The Heart" (Figure
3). A more detailed explanation of the concept comparison task will be provided later.

Insert Figures 1, 2, & 3 about here



In Figure 1, MDS procedures were used to plot eight important concepts from a science text passage.
In the top map, meaningful concept clusters were objectively identified, then outlined and labeled. In the
bottom map, map interpretation highlights relationships rather than clusters. The relationship labels are
similar to those listed by Holley & Dansereau (1984), Frederiksen (1975), and de Beaugrande (1980).
Although mainly objective procedures are used to produce MDS maps from pairwise ratings, subjective
judgment is required for map interpretation, as well as knowledge of the content area and the particular text
passage.

Interpretation of the HCA cluster tree includes (a) selecting the most defensible branching level(s), and
(b) providing definitions or descriptions for the categories (clusters) at those level(s). Cluster analysis has
proved valuable in this secondary analysis role (Coxon, 1982; Griffeth, Hom, DeNisi, & Kirchner, 1985).
Figure 4 displays an HCA solution for the MDS map configuration. The scree plot (explained later) beneath
the tree indicates that a three-cluster solution is most defensible; however, a five-cluster solution is also
interpreted on the tree for demonstration purposes.



Insert Figure 4 about here



The "trunk" of the tree is labeled with the passage title, "The Heart". Note that the branch labels for the
three-cluster solution are those used for the clusters on the first MDS map in Figure 1. The cluster tree thus
provides a third, complementary interpretation of an MDS map configuration.

Figures 1 and 4 demonstrated the first main purpose of MDS: improved understanding and
communication of complex semantic relationships. The second main MDS purpose, verifying concept
patterns, can use externally produced "expert" maps as standards for evaluating an individual student's
concept map. The expert standard map and learner maps can be compared through analysis of
concept cluster membership and/or similarity of inter-concept map distances. Qualitative comparisons
between standard and learner maps also are possible by interpreting map configuration differences as more
or less serious or trivial.

Figures 1 and 4 demonstrate the strength and limitation of the MDS mapping procedures. The spatial
dimensions are not well suited to displaying syntactic or mechanical text-based structures or detailed
networks of propositional relationships. The maps do, however, provide a very flexible "problem space" for
demonstrating a range of semantic relationships, including both abstract and perceptual analogue
relationships. In this way, they most closely approximate Johnson-Laird's (1983) Mental Models construct.
Although the mapped elements in Figure 1 are "micro-units" (individual vocabulary terms), the interpreted
map depicts a "macro-level" structure of total content organization (Meyer & Rice, 1984). The semantic
maps appear equally well suited to measuring pre- and post-reading knowledge structures, and semantic
relationships in text.

Neither MDS nor cluster analysis is as well validated as the more common parametric multivariate
techniques of factor analysis and discriminant function analysis (Davison, 1983; Aldenderfer & Blashfield,
1984). However, MDS is supported by a body of psychometric research, summarized in recent reviews
(Carroll & Arabie, 1980; Young, 1984), textbooks (Davison, 1983; Schiffman, Reynolds, & Young, 1981) and
dedicated journal issues (Applied Psychological Measurement, Vol. 7, No. 4, 1983; Psychometrika, Vol. 51,
No. 1, 1986). While MDS lacks the statistical power associated with normal distribution assumptions and
interval/ratio measurement scales, it offers distinct benefits. Foremost are that (a) MDS solutions are easily
interpreted, (b) MDS provides valid results with ordinally-scaled data, and (c) the methodology is suitable for
small sets of observations (Schiffman, Reynolds, & Young, 1981). In addition, MDS can usually fit an
appropriate model to the original data in fewer dimensions than factor analysis (Wilkinson, 1 90^).

HCA, which is relegated in this study to supplementary analyses, is considered an "exploratory"
technique, seldom recommended for primary analyses (Everitt, 1988, p. 604). Together, MDS and cluster
analysis offer spatial maps and hierarchical trees which are similar to the more flexible spatial cognitive
models (Johnson-Laird, 1980; Holley & Dansereau, 1980; van Dijk & Kintsch, 1983), as well as the maps
traditionally constructed by teachers intuitively and by hand. Unfortunately, relatively few studies exist in
which the methods have been applied to reading or other student learning.
Multidimensional Scaling and Student Learning

Multidimensional Scaling has been used to study changes in students' semantic structures following
instruction in social studies (Stasz, Shavelson, Cox, & Moore, 1976), research design (Fenker, 1975), and
psychology (Weiner & Kaye, 1974; Diekhoff, 1982). Fenker (1975) conducted two studies matching student
MDS maps with those from subject matter experts, both before and after instruction. The closeness of
relationship of pairs of "research design" concepts was judged by eight experts, and then by twenty
students enrolled in the university course. The MDS maps produced by the experts were substantially
similar. Student maps showed only slightly stronger agreement with expert maps from before to after
instruction. In the second study, 27 new students were additionally directed to give special attention to
learning the key concepts and their interrelationships. Post-instruction results demonstrated greater
similarity between student and expert maps. In addition, a significant relationship was found between
students' course grades and the similarity of their own maps with the experts'.

External criteria such as course grades and test results have also been used to help validate concept
comparison (CC) scores and the derived MDS maps (Brown & Stanners, 1984; Diekhoff, 1983; Stanners,
Brown, Price, & Holmes, 1983). Diekhoff (1983) compared multiple-choice, essay test, and CC test results
from 120 undergraduate students enrolled in a psychology class. Correlations between the CC task and the
other two test forms were .44 and .58, respectively, leading the author to conclude that "relationship
judgment tests tap both definitional knowledge of the sort measured by the multiple-choice tests ... and
structural knowledge of the sort measured by essay tests" (p. 230).

In two studies, Stanners, Brown, Price, and Holmes (1983) compared performance by 64 psychology
students on a CC task with three types of short-answer essay questions on the same content: definition
questions, applications, and questions requiring discussion of relationships. Following analysis by MDS, CC
scores correlated .66 with a composite of the three essay question types. The authors stated that

the concept comparison task would appear to be useful whenever the focus of interest is on a complete
pattern of relationships among units of knowledge. The rating data are relatively easy to gather and,
when analyzed by multidimensional scaling, allow both visual and quantitative forms of representation.
The results ... provide evidence that such representations reflect actual knowledge of conceptual
interrelationships (p. 863).
Multidimensional Scaling and Reading Comprehension

More directly related to the present study are the few applications of MDS to expository and narrative
reading passages (Bisanz, LaPorte, Vesonder, & Voss, 1978; LaPorte & Voss, 1979; Beaugrande, 1980;
Stanners, Price, & Painton, 1982). These studies produced two-dimensional cognitive structure maps from
student recall of story elements, and compared the student-produced maps with either pre-reading maps or
"expert" maps. Stanners et al. (1982) had 60 college students rate all possible combinations of five fictional
characters and three settings after reading an O. Henry short story. Most of the MDS generated maps
contained two dimensions: time sequence and character-setting connections. A second finding was that
mapped configurations of story elements were found to change as a function of pre-reading the text.

LaPorte and Voss (1979) explored changes in cognitive maps produced by college students before
and after text reading. Students in a control group also completed the concept-comparison tasks, but did
not read the two 100-word descriptive passages from which the words were drawn. Students judged
relationships between vocabulary pairs immediately after reading the passages and again 48 hours later.
Changes in concept ratings between the pre- and post-reading assessments accurately reflected the
subjects' increased understanding of the story. The authors also found that the ease of delayed passage
recall was due to the similarity of the story structure and students' pre-reading schema or knowledge
structure.

Two of the preceding studies (LaPorte & Voss, 1979; Stanners, Price, & Painton, 1982) have focused
on Davison's (1983) third type of MDS application: dimensional interpretation to summarize a map
configuration. That use of MDS is parallel to factor analysis, where the researcher seeks
factors with efficient explanatory power. Within the present study, however, the focus is on concept
configurations: clusters and relationships. Reducing data to two dimensions greatly reduces the method's
diagnostic utility (Shepard, Kilpatric & Cunningham, 1975).

The few studies applying MDS to student learning and reading in particular [...]. However,
those reading studies have employed a very limited number and variety of passages, mainly from adult-level
reading material and with able readers. Map interpretations most often have been dimensional rather than
configurational, reducing their potential for diagnosis and instruction. Although the validity of MDS
procedures has begun to be addressed in the few studies just cited, reliability has not.
Purpose

This study investigates the use of concept comparisons and spatial maps to assess
comprehension of expository reading passages by Jr. and Sr. High School students with reading disabilities.
The study was conducted in two phases, addressing instrument reliability (Phase 1), and instrument
sensitivity and criterion-related validity (Phase 2). The central question of Phase 1 was: After reading
250-word science and social studies textbook passages, will teachers independently produce similar concept
comparison (CC) ratings and MDS maps? The usefulness of MDS in assessing reading comprehension
depends partly on the reliable identification of "expert" maps to compare with pre- and post-reading student
maps. In Phase 2, pre- and post-reading CC scores and MDS maps from disabled readers in Junior and
Senior High Schools were compared with the expert teacher maps and with four external criterion reading
measures.




Phase 1: Instrument Reliability
Method

Eight 250-word passages were selected from elementary level social studies and science texts (Holt
Science, Holt General Science, Heath Life Science, Heath Social Studies). The content of the selected
passages, with their Fry readability levels, are: "One-Celled Organisms" (S.Q), "Igneous Rocks" (5.^), "The
Heart" (5^), "The Seashore" (7.5), "The History of Texas" (4.5), "Regions of the Soviet Union" (Q), "The
Skeletal and Muscular Systems" (7), and "Limits on Animal Population Growth" (7). Selected passages are
included in Appendix A.

Passages were selected to be cohesive and self-contained within a 250-word limit, and typically
contained at least one central idea and 8 to 12 key content-related vocabulary terms. Passages were
minimally edited to delete "asides", references to charts, figures, and text located elsewhere, and sentences
of only peripheral reference.

Eight key vocabulary terms were selected from each passage for pair-wise judgments within a concept
comparison (CC) test. "Key vocabulary" were words with central importance to the passage, including both
content words and non-content words with content-specific meanings within the text. Words selected
included all those highlighted by text publishers through bold/italic type, underlining, or margin notes. "Key
vocabulary" and "concepts" are used interchangeably in this paper.
Concept Comparison Tasks

For each selected passage, all pairwise combinations of the 8 key vocabulary terms were listed in a
"Ross ordering" sequence to avoid contaminating order effects (Cohen & Davison, 1973; Davison, 1983).
Although a minimum of 9 concepts are recommended for a 2-dimensional MDS map (Kruskal & Wish,
1978), that recommendation assumes that only one CC task is conducted, and can be "weakened
somewhat" (Schiffman, Reynolds, & Young, 1981, p. 24) for multiple ratings as in this study, where ratings
for each passage were obtained (and then aggregated) from five different teacher experts.

Beside each pair of concepts, respondents used a 4-point scale to judge how closely the two terms
were related or connected in the passage, i.e., how much the terms "had to do with each other" or to what
extent they "could be used to describe each other". The cues "close relation" and "little or no relation" were
attached to the two extremes of the scale. The CC task yielded a set of 28 ratings (each from 1 to 4) on each
passage from each teacher (see Figure 2).
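
The size of the rating task follows from the number of terms: eight terms give 8 x 7 / 2 = 28 unordered pairs. A minimal Python sketch of how such a task could be assembled and stored is given here. The term labels, the `to_dissimilarity` helper, and the assumption that 4 marks the "close relation" end of the scale are illustrative and are not taken from the study. The pairs are listed in plain combination order, not in the Ross ordering used for the actual task.

```python
from itertools import combinations

# Hypothetical labels; the study's actual term lists are not reproduced here.
TERMS = ["heart", "blood", "artery", "vein", "valve", "atrium", "ventricle", "lungs"]


def concept_pairs(terms):
    """Return every unordered pair of terms (one CC item per pair)."""
    return list(combinations(terms, 2))


def to_dissimilarity(terms, ratings, scale_max=4):
    """Turn pair ratings into a symmetric dissimilarity matrix.

    Assumes a higher rating means a closer relation, so that
    dissimilarity = scale_max - rating (0 for the closest pairs).
    """
    index = {term: i for i, term in enumerate(terms)}
    size = len(terms)
    matrix = [[0] * size for _ in range(size)]
    for (a, b), rating in ratings.items():
        i, j = index[a], index[b]
        matrix[i][j] = matrix[j][i] = scale_max - rating
    return matrix


pairs = concept_pairs(TERMS)
print(len(pairs))  # 28

# Made-up ratings: every pair rated 2 except one closely related pair.
ratings = {pair: 2 for pair in pairs}
ratings[("heart", "blood")] = 4
matrix = to_dissimilarity(TERMS, ratings)
```

A matrix of this kind is the usual input to an MDS program; the conversion from ratings to dissimilarities shown here is only one possible convention.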
Respondents 

The 15 "expert" respondents, all employed by a rural Pacific Northwest school district, included two
district coordinators and 13 reading specialists and Special Education teachers from six Junior (Gr. 6-8) and
Senior (Gr. 9-12) High Schools. Of the Junior High School teachers, three taught in Special Education
resource rooms (PL 94-142 categorical), and three in Chapter 1 (remedial [...]). Five of
the High School teachers taught Special Education, and two Chapter 1. For each of the eight passages, five
teachers separately completed a CC rating task. No teacher rated the same passage twice.
Procedure

The "expert" raters first read the 250-word passages and then independently completed related CC
tasks. While making concept comparison judgments, they were encouraged to look back at the passage
and to change initial ratings if they wished. No time limit was set for the task; most respondents required 7
to 9 minutes to read and rate each passage. Each teacher completed three or four CC tasks during each of
two sessions. Members of the research team introduced the task to the group, and were present through
the sessions to proctor and answer questions.
Data Analyses 

Interrater agreement was first calculated for teachers' concept comparison ratings using two indices:
the intraclass correlation (Brennan, 1983; Cronbach, Gleser, Nanda, & Rajaratnam, 1972) and Cohen's
Kappa (Fleiss, 1981; Cohen, 1968). Next, for only the most reliable passages, HCA was conducted on
map clusters, and agreement on cluster membership was assessed (Rand, 1971; Morey & Agresti, 1984).
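
As an illustration of the first index, the sketch below computes an intraclass correlation from a table with one row per concept pair and one column per rater. The text does not say which form of the coefficient was used, so the choice here (the two-way, single-rater consistency form often labeled ICC(3,1)) is an assumption, as are the function name and the toy data.

```python
def icc_consistency(table):
    """Two-way consistency ICC for a single rater, often labeled ICC(3,1).

    table: list of rows, one row per rated item (concept pair),
           one column per rater.
    """
    n = len(table)       # number of items
    k = len(table[0])    # number of raters
    grand = sum(sum(row) for row in table) / (n * k)
    row_means = [sum(row) / k for row in table]
    col_means = [sum(row[j] for row in table) / n for j in range(k)]

    # Sums of squares from a two-way ANOVA without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


# Toy data: 4 concept pairs rated by 3 raters on the 1-4 scale.
toy = [
    [1, 1, 2],
    [2, 2, 3],
    [3, 4, 4],
    [4, 4, 4],
]
print(round(icc_consistency(toy), 2))
```

With 28 pairs and five raters the table would have 28 rows and 5 columns. Other forms of the coefficient (for example, those that penalize differences in rater means) would give different values on the same data.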

Results 

Concept comparison ratings from five teachers were analyzed for each of the eight passages, using the
intraclass correlation coefficient (Brennan, 1983) and Cohen's Kappa (Fleiss, 1981). Two methods for
improving the interpretability of Kappa are (a) calculating the ratio of obtained Kappa to the maximum
Kappa obtainable (Brennan & Prediger, 1981), and (b) differentially weighting scores by the degree of
disagreement on the ordinal rating scale (Cohen, 1968). Table 1 presents these measures of agreement
for five "expert" raters on the eight passages.
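
The three Kappa variants named above can be sketched for a single pair of raters as follows. This is an illustration only: the function names are hypothetical, the linear disagreement weights are an assumption (Cohen's weighted Kappa allows other weights), and the text does not state how the two-rater statistic was extended to five raters.

```python
def _proportions(a, b, categories):
    """Joint proportion table and both raters' marginal proportions."""
    idx = {c: i for i, c in enumerate(categories)}
    k, n = len(categories), len(a)
    joint = [[0.0] * k for _ in range(k)]
    for x, y in zip(a, b):
        joint[idx[x]][idx[y]] += 1.0 / n
    row = [sum(joint[i]) for i in range(k)]
    col = [sum(joint[i][j] for i in range(k)) for j in range(k)]
    return joint, row, col


def kappa(a, b, categories=(1, 2, 3, 4), weighted=False):
    """Cohen's Kappa; with weighted=True, linear disagreement weights."""
    joint, row, col = _proportions(a, b, categories)
    k = len(categories)

    def weight(i, j):
        if weighted:
            return abs(i - j) / (k - 1)   # larger gaps count as more disagreement
        return 0.0 if i == j else 1.0     # simple Kappa: any mismatch counts fully

    observed = sum(weight(i, j) * joint[i][j] for i in range(k) for j in range(k))
    expected = sum(weight(i, j) * row[i] * col[j] for i in range(k) for j in range(k))
    return 1.0 - observed / expected      # undefined if expected disagreement is 0


def kappa_max(a, b, categories=(1, 2, 3, 4)):
    """Largest simple Kappa attainable given the two raters' marginals."""
    _, row, col = _proportions(a, b, categories)
    p_max = sum(min(r, c) for r, c in zip(row, col))
    p_chance = sum(r * c for r, c in zip(row, col))
    return (p_max - p_chance) / (1.0 - p_chance)


# Made-up ratings from two raters on eight concept pairs.
rater_a = [1, 2, 2, 3, 4, 4, 1, 3]
rater_b = [1, 2, 3, 3, 4, 3, 2, 3]
simple = kappa(rater_a, rater_b)
ratio = simple / kappa_max(rater_a, rater_b)
weighted = kappa(rater_a, rater_b, weighted=True)
print(round(simple, 2), round(ratio, 2), round(weighted, 2))
```

The ratio to the maximum Kappa removes the ceiling imposed by unequal marginal distributions, and the weighted form gives partial credit for near misses on the ordinal scale, which is why both tend to exceed the simple coefficient.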



Insert Table 1 about here 



Intraclass correlations are all moderate to high, while simple Kappas are more variable and lower, ranging
from .27 to .51; values at .40 and above indicate good agreement beyond chance (Fleiss, 1981).
Reconsidering Kappas in ratio to their maximum possible value (Kappa Max.) yielded substantially higher
values (.35 - .78 range). Similarly, differential weighting of degrees of disagreement increased Kappas by [...]
- .15 points. From the tabled information, three CC tasks, "The Heart", "Igneous Rocks", and "The Skeletal
System", demonstrated sufficient reliability for use with students [...]. For
each of these passages, the concept comparison scores were averaged across raters in preparation for the
second phase of the study.
Map Configuration Reliability

The preceding reliability indices were based on raw CC rating scores. Reliability of map clustering was
next assessed, but for only the three most reliable passages. For these passages, an MDS map was
produced for each of the five raters, using the stand-alone ALSCAL statistical software (Young &
Lewyckyj, 1973) with the classical non-metric (CMDS) algorithm. The goodness of fit of each map to the
rating data was first assessed through Kruskal's Formula 1 Stress (Davison, 1983). All but one of the
fifteen Stress values were below .02, representing a very good fit for two dimensions and at least nine
concepts (Kruskal & Wish, 1978). However, the small number of mapped concepts may have been
somewhat overfit to two dimensions, artificially lowering Stress values.
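
For readers unfamiliar with the fit index, Kruskal's Stress Formula 1 compares the inter-point distances on the map with the "disparities" (the monotonically transformed input data): it is the square root of the summed squared differences divided by the summed squared distances. The sketch below is a minimal illustration with made-up coordinates. In a real non-metric solution the disparities come from a monotone regression step inside the MDS program, which is not reproduced here; they are simply passed in.

```python
import math
from itertools import combinations


def map_distances(coords):
    """Euclidean distance for every pair of map points, in combinations() order."""
    return [math.dist(p, q) for p, q in combinations(coords, 2)]


def stress1(distances, disparities):
    """Kruskal's Stress Formula 1: sqrt(sum((d - dhat)^2) / sum(d^2))."""
    numerator = sum((d - dh) ** 2 for d, dh in zip(distances, disparities))
    denominator = sum(d ** 2 for d in distances)
    return math.sqrt(numerator / denominator)


# Three made-up map points; their pairwise distances are 5, 4 and 3.
coords = [(0.0, 0.0), (3.0, 4.0), (0.0, 4.0)]
distances = map_distances(coords)

# Hypothetical disparities that differ from the map on one pair only.
disparities = [5.0, 4.0, 4.0]
print(round(stress1(distances, disparities), 3))
```

A value of 0 means the map reproduces the (transformed) data exactly. With only eight points placed in two dimensions there are few constraints on the solution, which is the overfitting concern raised above.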

Agreement among the five MDS map configurations was assayed by (a) comparing inter-concept map
distances through the intraclass correlation, and (b) comparing cluster composition through the Rand
statistic. Euclidean map distances between all possible concept pairings (28 in all) are analogous to the
original 28 CC ratings. The intraclass correlation reliability estimates for map distances were: "The Heart"
.68; "Igneous Rocks" .69; "The Skeletal System" .83, all significant at p < .01.

To assess agreement of cluster composition from the maps, the number and composition of clusters
first had to be determined. Although concept clusters often can be discerned visually, a more systematic
procedure was used: HCA accompanied by scree plots (Davison, Richards, & Rounds, 1986; Coxon, 1982).
The Group Average clustering algorithm (Sneath & Sokal, 1973) was used, as it produced interpretable
solutions for these data and performed well in Monte Carlo studies (Milligan, 1980, 1981).
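
Group average clustering repeatedly joins the two clusters whose members are, on average, closest to each other. The following plain-Python sketch shows the idea on a small distance matrix and records the joining distance at each step, which is the quantity a scree plot displays against the number of clusters. The function name and toy data are illustrative; this is not the software used in the study, and ties are broken by taking the first closest pair found.

```python
def group_average_linkage(dist):
    """Agglomerative clustering with group average linkage.

    dist: symmetric matrix (list of lists) of pairwise distances.
    Returns one (cluster_a, cluster_b, joining_distance) tuple per merge.
    """
    clusters = [(i,) for i in range(len(dist))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                a, b = clusters[i], clusters[j]
                # Mean distance over all cross-cluster pairs of points.
                d = sum(dist[p][q] for p in a for q in b) / (len(a) * len(b))
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = tuple(sorted(clusters[i] + clusters[j]))
        clusters = [c for pos, c in enumerate(clusters) if pos not in (i, j)]
        clusters.append(merged)
    return merges


# Toy example: four points on a line at positions 0, 1, 5 and 6.
positions = [0, 1, 5, 6]
dist = [[abs(p - q) for q in positions] for p in positions]
merges = group_average_linkage(dist)

# Scree data: number of clusters remaining after each merge, with its joining distance.
scree = [(len(positions) - step - 1, d) for step, (_, _, d) in enumerate(merges)]
print(scree)  # [(3, 1.0), (2, 1.0), (1, 5.0)]
```

In this toy case the joining distance jumps from 1.0 to 5.0 at the last merge, which points to two clusters as the natural partition.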

On a cluster tree, each branching level is a different potential clustering solution. The optimal clustering
level(s) are identified on a scree plot of "number of clusters" by "joining distances" (Mojena, 1977;
Aldenderfer & Blashfield, 1984). As in factor analysis, a flattening of the scree line indicates the optimal
partition. These procedures identified one or two optimal clustering solutions for each rater for each map.
Following map cluster identification, agreement on cluster membership was assessed using Rand's statistic,
which was devised for this very purpose (Rand, 1971). A chance-correction for the Rand, "omega" (Ω), was
used, which is scaled from 0 (chance agreement) to 1 (perfect agreement) (Morey and Agresti, 1984). The
Ω ranges (and medians) showed uniformly high cluster agreement: "The Heart" .73, (1.0), 1.0; "Igneous
Rocks" .48, (.68), 1.0; "The Skeletal System" 1.0, (1.0), 1.0. In summary, reasonable interrater reliability
was obtained for these three passages, based on CC scores, map distances, and map clustering.
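
Rand's statistic is the proportion of object pairs on which two clusterings agree, that is, pairs placed together in both or apart in both. The sketch below computes it along with one common chance-corrected version. The correction shown is the widely used Hubert-Arabie form, which is an assumption here: it may differ in detail from the Morey and Agresti omega reported above. The cluster labels are made up.

```python
from collections import Counter
from itertools import combinations
from math import comb


def rand_index(labels_a, labels_b):
    """Proportion of object pairs treated the same way by both clusterings."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def adjusted_rand(labels_a, labels_b):
    """Chance-corrected Rand (Hubert-Arabie form); 0 = chance, 1 = perfect.

    Undefined (division by zero) if both clusterings put everything in one cluster.
    """
    n = len(labels_a)
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = together_a * together_b / comb(n, 2)
    maximum = (together_a + together_b) / 2
    return (together_both - expected) / (maximum - expected)


# Two raters' three-cluster solutions for eight concepts (made-up labels).
rater_1 = [0, 0, 0, 1, 1, 2, 2, 2]
rater_2 = [0, 0, 1, 1, 1, 2, 2, 2]
print(round(rand_index(rater_1, rater_2), 2), round(adjusted_rand(rater_1, rater_2), 2))
```

Cluster labels are arbitrary, so only the grouping matters; relabeling one rater's clusters leaves both values unchanged.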

In preparation for Phase 2 of the study, an average "expert map" was then created for each of these
three reliable passages. First, the five teachers' CC ratings were "externally averaged" (Schiffman,
Reynolds, & Young, 1981, p. 179). For each average data matrix an MDS map was then processed through
ALSCAL's classical non-metric algorithm. The more complex Replicated algorithm (RMDS) produced
nearly identical clusterings to the simpler CMDS solution. The main advantage of RMDS is its ability to
describe "dimensional variation" among individual respondents, which does not address our goal of
producing a valid average map (Schiffman et al., 1981, p. 65). Therefore, only the CMDS procedure was
used in this study.

Optimal cluster solutions on the average expert teacher maps were then identified through the
HCA-plus-scree plot procedure described earlier. These three average maps, with optimal clusters outlined, are
presented in Appendix B.




Phase 2: Instrument Sensitivity and Criterion-related Validity
The purpose of the second phase of the study was to investigate the sensitivity and criterion-related
validity of student CC scores and related MDS maps for assessing reading comprehension. Two main
comparisons were conducted. To gauge sensitivity, students completed CCs before and after reading, and
their pre- and post-reading scores were correlated with the average "expert" CC scores. To determine
validity, each student's degree of association with "expert" scores was compared with his/her performance
on two classes of external measures: (a) extant vocabulary and reading comprehension scores from
published, norm-referenced reading tests, and (b) maze tests, multiple choice questions, and oral reading
fluency performance, all based on the reading passages.

Method 

Respondents 

This study was conducted in a west coast low-middle SES rural community with an economy
dependent on the logging industry. At the Jr. and Sr. High levels the lowest achieving nine percent of each
grade cohort (approximately 33 in all) were enrolled in Chapter 1 (compensatory) or Special Education
(LD category) programs in reading/language arts. From this population were sampled 240 students, all
those for whom current standardized achievement data were available. The high rate of absenteeism,
school transfers, and incomplete test protocols reduced this sample to 104 by the end of the study. Yearly
enrollment turnover was nearly 40% for the district, and exceeded 60% for students in special programs. All
data presented are for the 104 students, drawn from thirteen classrooms within four Jr. (Gr. 6-8) and two Sr.
(Gr. 9-12) High schools.

Fifty-three of the 104 students were enrolled in Jr. High, and 51 in Sr. High. Forty-three attended
Special Education resource rooms for language arts, and 61 received pull-out Chapter 1 assistance.
Current standardized achievement test scores from the district-administered Metropolitan Achievement
Tests ( ) were available for 81 of the students. For the remaining 23 students, current Woodcock Johnson
(13), WRAT (5), Nelson Achievement Tests (2), and Iowa Achievement Tests (2) were available. Available
scores included percentile ranks, grade equivalents, and normal curve equivalents. From technical
manuals, all scores were converted to comparable normalized percentiles for the summary provided in
Table 2. Because percentile scales have unequal units, these scores were then converted to normalized
standard scores prior to further analyses (Anastasi, 1988).
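The conversion from percentile ranks to normalized standard scores can be sketched as an inverse-normal lookup. The sketch below is illustrative and not the study's procedure: the rescaling to normal curve equivalents (mean 50, SD 21.06) is shown as one common equal-interval scale, and the function names are hypothetical.

```python
from statistics import NormalDist


def percentile_to_z(percentile):
    """Normal deviate whose cumulative probability equals the percentile rank."""
    return NormalDist().inv_cdf(percentile / 100.0)


def z_to_nce(z):
    """Rescale a normal deviate to a normal curve equivalent (assumed scale)."""
    return 50.0 + 21.06 * z


# Percentile ranks near the medians reported for this sample.
z_20 = percentile_to_z(20)
z_24 = percentile_to_z(24)
```

On this scale, equal differences in standard scores correspond to equal distances on the underlying normal curve, which is not true of raw percentile ranks.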



Insert Table 2 about here



ANOVA performed on the extant vocabulary and reading comprehension scores showed no significant
differences among grades at either the Jr. or Sr. High School levels. Therefore, for Table 2 and all
subsequent analyses, Grades 6-8 and Grades 9-12 were grouped together. Table 2 shows median scores
around the 20th to 24th percentiles for all students but those enrolled in Sr. High special education.
Instrumentation

Students were assessed through four procedures, all based on the three most reliable passages ("The
Heart", "The Skeletal System", and "Igneous Rocks"): (a) concept comparison (CC) rating tasks, (b) Maze
(multiple choice cloze) tasks, (c) sets of 10 multiple choice questions, and (d) oral reading fluency.

Concept comparison (CC) tasks. Three of the CC tasks completed by teachers were also completed
by students. Each CC task consisted of twenty-eight ratings of concept pairs drawn from a passage.
Ratings were performed on a four-point scale to indicate the perceived relatedness of each pair of concepts,
how much the two concepts "had to do with each other" (see Figure 2).
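The figure of twenty-eight ratings follows from pairing eight key concepts in every possible way: 8 x 7 / 2 = 28. A minimal sketch of that enumeration is given below. The eight terms are the concepts named in the discussion of "The Heart" maps, assumed here to be the full set for that passage.

```python
from itertools import combinations

# Concepts mentioned for "The Heart"; treating these as the complete
# eight-word list is an assumption made for illustration.
concepts = ["cardiac", "tissue", "contracts", "artery",
            "atrium", "ventricle", "valve", "chamber"]

# Every unordered pair of concepts is one item on the CC rating task.
concept_pairs = list(combinations(concepts, 2))
n_ratings = len(concept_pairs)
```

The count grows quadratically with the number of concepts, so ten concepts would already require 45 ratings.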

Maze tests. Multiple choice cloze tests (Howell & Kaplan, 1980) were produced from each passage.
Every sixth word was omitted from all but the first and last sentences of the text. The omitted words
(approximately 35 per passage) formed the pool or universe from which distractors were selected, with
replacement. Distractors were excluded if they were both syntactically and semantically sensible within the
sentence. For each deletion in the text, students selected one of five options.
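The Maze construction rule can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' procedure: the function name, the sentence-splitting rule, and the seeded random draw are hypothetical, and the human screening that excluded sensible distractors is not reproduced.

```python
import random
import re


def build_maze(text, step=6, n_options=5, seed=0):
    """Delete every `step`-th word outside the first and last sentences
    and attach distractors drawn from the pool of deleted words."""
    rng = random.Random(seed)
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    middle_words = " ".join(sentences[1:-1]).split()
    deleted = [
        (i, word.strip(".,;:!?"))
        for i, word in enumerate(middle_words)
        if (i + 1) % step == 0
    ]
    # The pool is reused for every item, so a word may serve as a
    # distractor on more than one deletion.
    pool = sorted({word for _, word in deleted})
    items = []
    for position, answer in deleted:
        candidates = [w for w in pool if w != answer]
        k = min(n_options - 1, len(candidates))
        options = rng.sample(candidates, k) + [answer]
        rng.shuffle(options)
        items.append({"position": position, "answer": answer, "options": options})
    return items
```

Each returned item holds the position of the deleted word, the correct answer, and the shuffled option list a student would choose from.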

Multiple choice questions. A set of ten, four-option multiple choice questions was developed for each
passage. One was a "main idea" question, and the other nine required recognition of important facts and
relationships selected consensually from the text by two experienced reading teachers. With the exception
of the main idea question, only text-explicit questions were included.




Oral reading fluency. Assigned students also orally read an entire passage while being audio-taped at
the back of the classroom. Tapes were later scored for oral reading error counts and for elapsed time, in
order to calculate oral reading fluency: rate of words read correctly per minute.
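The fluency rate described above is simple arithmetic: words read correctly divided by elapsed minutes. A minimal sketch follows; the function name and the example values are hypothetical and are not data from the study.

```python
def words_correct_per_minute(total_words, errors, elapsed_seconds):
    """Oral reading fluency: correctly read words per minute of reading."""
    words_correct = total_words - errors
    return words_correct / (elapsed_seconds / 60.0)


# Hypothetical reader: a 250-word passage, 10 errors, read in 2 minutes.
rate = words_correct_per_minute(250, 10, 120)
```

Here 240 words were read correctly in two minutes, giving a rate of 120 words correct per minute.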

CC assessment was conducted in two stages, approximately one month apart. Both stages followed a
pre-posttest control group design, with random assignment of groups to treatment conditions. At the first
stage, two CC tasks were assigned to Jr. and Sr. High schools, respectively: "The Heart" (Fry readability
3.8), and "Skeletal and Muscular System" (Fry readability 5.4). During the second stage students were
reassigned to treatment and control groups, and students at both levels received the same passage,
"Igneous Rocks". The treatment group was administered a maze test immediately after the pre-reading CC
test, and completed multiple choice and oral reading fluency tests following the post-reading CC test.
Design elements are summarized in Table 3.



Insert Table 3 about here 



Stage 1. On day 1 of the first stage, during reading/language arts classes, teachers demonstrated the
CC task, from scripted instructions. Students then were asked to complete the CC test for the passage at
their grade level. Fifteen minutes were allowed for the test, though all but a few students finished before 10
minutes.

On day 2, each student was randomly presented with one of two text passages for silent reading,
either related or unrelated to the concept comparisons completed on day 1. The two passages
were handed out to students in alternating order, according to classroom seating. The unrelated passages
were from the same science texts, had not been previously studied or read, and were of similar readability
levels as the test-related passages. There was no discussion or instruction of passage content either
before or after the reading. Immediately after reading, each student returned the passage to the teacher,
and then completed the post-reading CC test.




Stage 2. Approximately one month later, the research team returned to the school district for a
replication and expansion of the Stage 1 design, conducted over a four-day period. This design entailed
student reassignment to treatment (n = 43) and control (n = 49) groups (again by classroom seating). On
day 1, all students completed pre-reading CCs based on the same passage. Immediately
afterwards, students completed Maze tests within a 25 minute set limit. For both groups of students, the
Maze test was constructed from the passage they would read on day 2. The Maze test was administered to
the control group to control for possible Maze influence on the post-reading CCs.

On day 2 all students in the treatment group (n = 43) silently read the related passage, "Igneous
Rocks", and control group students (n = 49) read an unrelated passage of similar readability from the same
text. Immediately afterward, all students completed the post-reading CC test for "Igneous Rocks". Students
in the treatment group then also completed a 10-item multiple-choice test on the passage. On days 3 - 5
each student in the treatment group also read the "Igneous Rocks" passage into a tape recorder at the back
of the room. The uneven quality of audio recordings reduced the number of useable oral reading samples
to 38.
Data Analysis

The first analysis consisted of a three-way ANOVA conducted for each of the three passages: "The
Heart", "The Skeletal and Muscular System", and "Igneous Rocks". Two between-subject variables were
included, each with two levels: Reading passage (Related, Unrelated), and Program (Special Education,
Chapter 1). The within-subject variable was the repeated measure, Time of CC administration (Pre, Post).
The dependent measure was the correlation coefficient between student and expert CC scores. In order to
analyze Pearson r's as test scores within ANOVA, they were first transformed to Fisher Z scores (Hays,
1981). A significant "Reading passage x Time" interaction was hypothesized, with smaller main effects for
the two variables. No significant main effects or interactions were hypothesized for Program. As a
secondary analysis, for only those students who read the related passage, pretest and post-test CC expert
correlations were tested for significant differences with the Hotelling-Williams Test of correlation equality
(Darlington & Carlson, 1987).
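The dependent measure can be sketched in two steps: a Pearson correlation between a student's ratings and the expert ratings, followed by the Fisher transformation, Z = arctanh(r), which makes the scores more suitable for ANOVA. The helper names below are hypothetical, and the sketch is an illustration, not the software used in the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of ratings."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_z(r):
    """Fisher transformation of a correlation coefficient."""
    return math.atanh(r)


def inverse_fisher_z(z):
    """Convert a Fisher Z score back to a Pearson r (e.g., for plotting)."""
    return math.tanh(z)
```

The inverse function corresponds to the re-conversion of Fisher Z scores to Pearson r's that is needed when results are plotted in the original correlation metric.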




The second major analysis was the intercorrelation among scores from (a) pre- and post-reading
concept comparisons (Fisher Z scores), (b) standardized Reading Comprehension and Vocabulary tests, (c)
Maze tests, (d) Multiple choice tests, and (e) Oral reading fluency samples. It was hypothesized that
post-reading CC scores would be significantly correlated with the other measures, while pre-reading scores
would not be. These analyses were conducted to support the validity of the CC scores and maps. In
particular, spatial maps appear to hold the potential for diagnosing students' understandings and
misinterpretations of text, and planning relevant remedial instruction. To reinforce this priority, qualitative
analyses of students' maps are presented first, then quantitative results.

Results 

Qualitative Interpretation of Students' Maps

Maps of two students, Alice and Bob, with typical CC pre-reading (.12, -.09) and post-reading (.46,
.39) correlations (with expert maps) are presented in Appendix C. Agreement with the expert map of "The
Heart" was measured by interpoint map distances (Kendall Tau-B), and on clusterings (Omega transform of
Rand's statistic). For Alice's pre-reading map, τ-b = .08, and Ω = [illegible]. Her post-reading map showed
τ-b = .36, and Ω = .61. For Bob's pre-reading map, τ-b = .12, and Ω = .33. For his post-reading map,
τ-b = .40, and Ω = .74.
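Agreement on interpoint distances can be sketched by listing the pairwise distances on each map and rank-correlating the two lists with Kendall's tau-b, which adjusts for tied pairs. The sketch below is a plain-Python illustration; the function names are hypothetical and no map coordinates from the study are reproduced.

```python
from itertools import combinations
import math


def interpoint_distances(coords):
    """Euclidean distances between every pair of plotted concepts."""
    return [math.dist(p, q) for p, q in combinations(coords, 2)]


def kendall_tau_b(x, y):
    """Kendall's tau-b between two equal-length lists of values."""
    concordant = discordant = ties_x_only = ties_y_only = 0
    for (x1, y1), (x2, y2) in combinations(list(zip(x, y)), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied on both lists: contributes to neither term
        if dx == 0:
            ties_x_only += 1
        elif dy == 0:
            ties_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    denom = math.sqrt((concordant + discordant + ties_x_only)
                      * (concordant + discordant + ties_y_only))
    return (concordant - discordant) / denom
```

For a student map and an expert map with eight concepts each, `interpoint_distances` yields 28 values per map, and `kendall_tau_b` applied to the two lists gives an index comparable in kind to the τ-b values reported above.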

Alice's and Bob's maps can be qualitatively interpreted by comparing (a) their pre- and
post-reading maps, and (b) their maps with expert teacher maps. Interpretations can be based on either the map
distances among individual concepts or membership of outlined clusters. Both the average teacher map for
"The Heart" (Figure 1 or Appendix B) and Alice's pre-reading map (Appendix C) suggest a three-cluster
interpretation. The expert map yields two clusters, interpretable as (a) "composition and basic movement",
and (b) "main parts and connectors", with an "external part" as an outlier. These clusters are higher-order or
superordinate concepts. Alice's pre-reading map configuration does not include those higher-order
concepts. Instead, one large cluster exists, which is difficult to interpret beyond "everything but cardiac and
tissue". In Alice's pre-reading map, "cardiac" and "tissue" are outliers, although the first term is used to
describe the second in the passage.




By attending to inter-concept distances rather than only cluster membership, we can conduct a more
micro-level analysis of Alice's pre-reading map. Within Alice's large cluster, "artery" is on the cluster
periphery; it is also isolated on the expert map. However, the close proximity of "contracts" and "ventricle"
is difficult to explain, and best attributed to student misunderstanding. It is possible that such an
uninterpretable relationship was due to random CC task ratings. However, random ratings are not
indicated by the systematic relationships described later.

Alice's post-reading map more closely approximates the expert teacher map in that "cardiac" and
"tissue" are clustered apart from other concepts. In addition, the post-reading cluster of "valve", "chamber",
and "ventricle" approximates the expert teacher "Main Parts and Connectors" cluster. From
the pre- to post-reading map, "contracts" has shifted from a central, integrated position to an isolated
position. Even in this isolated position, it is in the vicinity of the "cardiac"-"tissue" cluster. Note
that "cardiac", "tissue", and "contracts" make up the "Composition and Basic Movement" cluster on the
expert map. In summary, student Alice's post-reading map shows greater differentiation of concepts toward
interpretable, higher-order clusters.

While changes in Alice's map more closely approximate the expert map, some map
features imply comprehension problems. First, the "artery"-"atrium" connection is not easily interpreted;
"atrium" should be closely associated with "ventricle" and "chamber". Second, the proximity of "contracts"
with the "artery"-"atrium" cluster is not easily interpreted. Both problems could be clarified and confirmed in
a student interview. A diagnostic interview would be especially useful when the purpose of assessment is to
diagnose misunderstandings and/or plan remedial instruction.

The main similarity between Bob's pre-reading map (Appendix C) and the expert map for "The Heart"
is that "cardiac" and "contracts" are clustered together and separated from the other concepts. The two
other pre-reading map clusters are, however, difficult to explain; each has a member ("tissue" and "artery",
respectively) which appears semantically less related to the other two cluster members.

Bob's post-reading map more closely approximates the expert teacher map in that the two ill-fitting
cluster members ("tissue" and "artery") have drifted away, and the remaining four concepts have become
realigned to form the "Main Parts & Connector" cluster. In drifting away, "artery", "cardiac", and "tissue"
have formed a cluster which is difficult to interpret. However, "artery" is clearly the outlying member of that
cluster. The main comprehension problem implied by the post-reading map is the isolation of "contracts":
the failure to recognize its close relationship to "cardiac" and "tissue". Again, an interview with the student
over the map would help confirm the interpretations made on the basis of clustering and inter-concept
distances.

In summary, the comprehension problems inferred from student Bob's post-reading map appear less
severe than those of Alice. Alice's most fundamental misunderstanding appears to be a confused
"artery"-"atrium" connection, while Bob's shows a less central definitional problem, a misunderstanding of the
"cardiac"-"tissue" relationship. Indices of expert agreement for post-reading maps based on inter-concept
distances (τ-b) and cluster membership (Ω) show that Bob (τ-b = .40, Ω = .74) slightly outperformed Alice
(τ-b = .36, Ω = .61). These same indices indicate that both students made similar gains from their pre- to
post-reading maps.

MDS maps are worth interpreting only if the maps are reasonably stable, and show systematic
differences between good and poor reading comprehenders. These qualitative interpretations should therefore
be supported by quantitative analyses from control-group designs, with representative sampling of teachers
and students. Results from Stage 1 and 2 help answer the question of instrument sensitivity, while results
from Stage 2 address the question of criterion-related validity.
Sensitivity of Concept Comparison Scores

Sensitivity of CC scores was defined as systematic changes from pre- to post-reading scores by
disabled readers who received no preteaching or other assistance. The systematic changes hypothesized
were toward closer agreement with the expert teacher CC scores. Results from three-way ANOVA are
presented for Jr. high ("The Heart") in Table 4, for Sr. high ("The Skeletal and Muscular System") in Table 5,
and for both levels together ("Igneous Rocks") in Table 6.



Insert Tables 4, 5, & 6 about here




Table 4 presents main effects and interactions for the three variables in accounting for Jr. High CC scores
on "The Heart". Strength of relationship is indicated by the generalized correlation coefficient, η ("eta")
(Hays, 1981). Two of the first order interactions were significant, accounting for 45% (Time x Read.) and
10% (Time x Prog.) of the total variance, respectively. Although interpretation of main effects can be
deceptive in the presence of significant interactions, one comparison stands out. The main effect for Time
is much larger (74% of the variance) than that for Read. (10% of the variance), although we would
hypothesize only a medium-small effect for both. This difference can be explained by the tendency by all
students to slightly improve in their CC scores at post-testing (the Time variable), presumably due to
practice effect (as will be noted in Table 7).

Tabled ANOVA results for Sr. high on "The Skeletal and Muscular System" were similar to those for
Jr. high, and consistent with hypotheses (see Table 5). At the Sr. High, only one of the three first-order
interactions was significant, "Time x Read." (41% of the variance). Both Time and Read. main effects were
again significant, with the much larger effect for Time (66% of the variance). The variable, Prog., did not
contribute significantly.

The replication study in Stage 2, with Jr. and Sr. High students together ("Igneous Rocks"), produced
results similar to the previous two analyses (see Table 6). The two Time-related interactions were
significant, but only "Time x Read." produced a sizeable effect (37% of the total variance, compared to only
7% for "Time x Prog."). Again, Time and Read. produced significant main effects, although only the former
was large (66% of the variance). Plots for the three most significant interactions (p < .01) are presented in
Figure 5. For the plots, the Fisher Z scores used in ANOVA were re-converted to Pearson r's.



Insert Figure 5 about here



The three very similar interaction plots indicate that at both Jr. and Sr. High, students who read the related
passage made significantly greater gains in CC scores than did those who read unrelated passages,
regardless of the type of special program enrollment.




Table 7 presents CC means and SDs for the three passages. For students reading the related
passages, mean Pearson r's were .07 to .10 before reading, and .36 to .47 after reading.



Insert Table 7 about here 



Although the ANOVAs discussed above provided pre- and post-reading CC score comparisons at the
group level, they do not provide information at the individual student level. Individual-level results are
essential if individual diagnosis or placement decisions may follow. Therefore, for
only those students who read the "related" passages, the null hypothesis of no significant difference
between pretest and post-test CC correlations with the expert scores was tested. The Hotelling-Williams
Test of the equality of dependent Pearson correlations (r12, r13) was used to compare pretest-expert
and posttest-expert correlations (Darlington & Carlson, 1987).
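A test of two dependent correlations that share one variable can be sketched with the t statistic commonly attributed to Williams, with N - 3 degrees of freedom. Whether this form matches the exact computation in Darlington and Carlson is an assumption. In the sketch, variable 1 is taken to be the expert ratings, variable 2 the post-reading ratings, and variable 3 the pre-reading ratings; the pre-post correlation of .30 in the example is hypothetical.

```python
import math


def williams_t(r12, r13, r23, n):
    """t statistic for H0: rho12 == rho13, where the two correlations
    share variable 1. Returns (t, degrees_of_freedom)."""
    # Determinant of the 3 x 3 correlation matrix.
    det = 1 - r12**2 - r13**2 - r23**2 + 2 * r12 * r13 * r23
    r_bar = (r12 + r13) / 2
    numerator = (n - 1) * (1 + r23)
    denominator = 2 * ((n - 1) / (n - 3)) * det + r_bar**2 * (1 - r23) ** 3
    t = (r12 - r13) * math.sqrt(numerator / denominator)
    return t, n - 3


# Post-expert r = .46 and pre-expert r = .12 over n = 28 ratings;
# the pre-post correlation (.30) is an assumed value for illustration.
t_value, df = williams_t(0.46, 0.12, 0.30, 28)
```

With only 28 ratings per student the degrees of freedom are small, so even sizeable gains in expert agreement can fall short of significance for an individual.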

For only 6 of the 97 treatment group students were pretest-expert correlations stronger than
posttest-expert correlations, and none of these differences was statistically significant. In contrast,
posttest-expert correlations were greater for 91 of the 97 students, and 36 of the Hotelling-Williams Z scores were
statistically significant at p < .05. Out of 97 score comparisons a number of significant pairs would be
expected by chance alone, so a Chi-square test was performed on the proportion of significant versus
non-significant findings. The resulting coefficient was highly significant: χ² (1, N = 97) = 84.75, p < .0001.
Criterion-Related Validity

The second major analysis was comparison of pre- and post-reading CC scores of Phase 2 treatment
group students (those who read the related passage) with external measures of reading comprehension.
Table 8 contains descriptive information on the CC scores, published standardized tests, Maze tests,
multiple choice tests, and oral reading fluency which were intercorrelated.



Insert Table 8 about here 




Table 8 clearly demonstrates the degree of students' reading difficulty. They averaged only 50% correct
on the Multiple choice test, and only 66% correct on the Maze (80-90% is an a[...]). It was
hypothesized that post-reading CC scores would be substantially related with the other measures, unlike
pre-reading scores. The correlation matrix in Table 9 shows small, non-significant relationships between
the pre-reading CC scores and external measures.



Insert Table 9 about here 



Pre-reading CC scores are significantly correlated only with their post-reading counterparts. In contrast,
post-reading CC scores show significant, moderate size relationships with the Maze (r = .61), Oral Reading
Fluency (r = .57), and the Multiple Choice Test (r = .45), all based on the same passage. Of the two
standardized reading tests, only Vocabulary was significantly related to other measures: the Maze (r = .43)
and Oral Reading Fluency (r = .4[...]).

To identify clusters and outliers in the correlation matrix, Ward's hierarchical clustering algorithm was
applied (Ward, 1963; Blashfield, 1980) (see Figure 6).



Insert Figure 6 about here 



The cluster tree indicates the relative isolation of the pre-reading CC scores and the two standardized test
scores. Post-reading CC scores cluster with oral reading fluency, and then with the other two
passage-based measures, the Maze and multiple choice test.

Discussion 

This study investigated the reliability, sensitivity, and criterion-related validity of concept comparison
(CC) scores and spatial maps for assessing content-area reading comprehension of Junior and Senior High
school students with reading disabilities. This method offers several advantages sought by reading
researchers: (a) reading comprehension can be measured as change from pre-reading schema to
post-reading semantic structures, (b) the same metric can be used for both the information structure of text and
the knowledge structure of the reader, (c) the maps are diagnostic: they encourage interpretation of how the
reader is organizing or misorganizing information, (d) the technique permits multiple correct answers from
different teacher "experts", (e) rather than isolated factual recall, the network of relationships among
concepts is emphasized, (f) the two-dimensional maps and hierarchical trees are [...]
common classroom use.

First, this study demonstrated the interpretability of student pre- and post-reading maps, through use of
expert teacher maps as a standard. Two approaches to map interpretation seemed helpful: interpreting
concept clusters (and changes in cluster membership), and interpreting inter-concept distances (and shifts
in relative positions). A combined approach seems natural. Minimal interpretation of alternative structural
views was undertaken. As a consequence, those qualitative interpretations which were made were not
forced. The interpretations earn credibility, however, only if the maps are stable and related
to other accepted measures.

Besides map interpretability, this study addressed three requisites of any assessment method
(reliability, sensitivity, and validity): (a) reliability of expert teacher concept comparison (CC) scores and MDS
maps, (b) the sensitivity of CC scores to response changes following relevant reading, and (c) concurrent
criterion-related validity: the relationship between CC scores and other reading measures.

The reliability of only the teacher CC scores and maps was directly studied; reliability of student CC
scores and maps was not, nor was the stability of teacher scores over time. It appears that CC tests are
reactive: pre-testing appeared to systematically influence post-test results in the direction of greater
similarity to the expert map.

The question of reliability of expert teacher CC scores and maps requires a qualified answer; six of the
eight passages met the minimum .70 to .80 reliability range for "early stages of research on predictor tests",
where the main concern is with group differences (Nunnally, 1978, p. 245). None of the CC tests met the
.90 to .95 reliability "desirable standard" for individual-level decision making (Nunnally, 1978, p. 246).
Three of the eight CC tests exceeded .80 reliability (.81, .81, .87), justifying their use in the second phase of
the study.

Reliability indices of MDS map clusterings were weaker. Only two of the Kappa/Kappa Max. ratios
were substantial (above .70). However, the implications of these reliability figures for decision making based
on a mapping test are not known. Substantially higher CC and map reliabilities would have been obtained
if two alternative expert maps had been allowed per passage. That move would have been supported by
observations of teachers' disagreements on the main idea of a story. Two "cognitive structures" may be
equally defensible, and the potential for accepting alternative expert maps is a [...]
method. Within the constraints of an initial study, however, it was neces[...]
allow two alternative expert maps.

To speak of reliability of the test and MDS mapping technique in g[...]
reliability clearly depended upon the particular passage. The variation in reliabilities among the eight
passages appeared to be largely a function of the key vocabulary words selected. There were no
constraints to key word selection; words were not required to conform to one or a few relationships or
dimensions, e.g. "physical connection" or "superordination". Absence of selection criteria permitted a
greater range of concept relationship interpretations, and a greater variety of maps. In light of the fact that
key vocabulary selection was free to vary, the degree of reliability [...]. The presumed
importance of key vocabulary selection to CC test reliability could be empirically studied from the existing
data base.

The second major purpose, assessment of treatment validity, can be answered affirmatively, at least
at the group level. Students did significantly improve their match with expert CC scores and maps after
reading related passages. At the individual level most students (94%) improved their expert agreement
from pre- to post-reading CC, but only 37% of the score improvements reached significance. The
Hotelling-Williams test of significance depends not only on the intercorrelations among the three CC results
(pre-, post-, expert), but on number of ratings, only 28 for this task. More concept comparisons would have
greatly increased the number of significant individual "improvements".

These group and individual treatment validity results were obtained despite the fact that all
students were deficient readers, and none received pre-teaching or other instruction in the content area
passages. Given those facts, the initial evidence on measurement sensitivity for disabled readers who
received no instruction is encouraging.




The third research question, assessment of concurrent, criterion-related validity, also receives a
tentative, affirmative response. As expected, the post-reading CC scores were most closely related with the
other three passage-related criterion measures: the Maze, multiple choice test, and oral reading fluency
(r = .61, .45, .57). Among these four passage-related measures, the multiple choice test and Maze were most
tightly clustered, followed by oral reading fluency and the post-reading CC scores. The pre-reading CC
scores, on the other hand, were not significantly related to any measure but their post-reading CC
counterparts. Pre-reading CC scores were clear outliers in the clustering of the six reading measures.

The largest matrix correlations were of only low-moderate to moderate size. The moderate reliability of
the CC scores may have imposed a ceiling on these validity relationships. Other possible reasons for
medium-low validity scores may reside in the external measures themselves: (a) lack of structural
sensitivity (Maze, ORF, Mult. Choice, St. Tests), (b) inability to account for pre-reading knowledge
differences (Maze, ORF, St. Tests), (c) information processing demands appear to differ from reading
(Maze, ORF, Mult. Choice, St. Tests), (d) questions unintentionally cuing responses (Maze, ORF, Mult.
Choice, St. Tests). Comparing a new measure with deficient standard criterion measures will always result
in less than satisfactory validity coefficients.
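The ceiling that reliability places on a validity coefficient can be sketched with the classical attenuation bound: an observed correlation between two measures cannot exceed the square root of the product of their reliabilities. The sketch below applies that bound; the criterion reliability of .80 is a hypothetical value, not one reported in this study.

```python
import math


def validity_ceiling(reliability_x, reliability_y):
    """Upper bound on the observed correlation between two measures,
    given their reliabilities (classical test theory)."""
    return math.sqrt(reliability_x * reliability_y)


# CC reliability of .81 (reported for two passages) paired with an
# assumed criterion reliability of .80.
ceiling = validity_ceiling(0.81, 0.80)
```

Under these assumptions the bound is roughly .80, so correlations of .45 to .61 reflect more than measurement error alone, though unreliability still accounts for part of the shortfall.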

This study served its purpose as an initial investigation of the reliability and validity of an
unresearched assessment approach. However, it raised several questions which need to be addressed
before these innovative techniques are used outside an experimental setting. One question is how many
different types of relationships among concepts can be plotted on a two-dimensional space while still
rendering an interpretable map. Interpretation of the MDS maps intentionally was not based on map
dimensions or axes (as in factor analysis), but rather on clustering of, and Euclidean distances within, plotted
configurations. This approach is legitimized by experts in the MDS field, though not frequently encountered
in the literature (Davison, 1983). However, a "problem space" of only two dimensions may tend to limit the
variety of relationship-types among concepts and clusters. In that sense, map dimensionality may play a
crucial, underlying role in map validity.

Increasing the number of map dimensions in order to less constrain the variety of interpretable
relationships is not a practical solution. The small number of concepts plotted would be seriously "over fit" to
the higher dimensionality, and solutions would lack stability. The question of a limit on the number of types
of relationships among concepts has direct bearing on how key vocabulary are initially selected. Map
reliability and interpretability need to be studied under different vocabulary selection guidelines.

The diagnostic and instructional utility of MDS maps will hinge in part on evidence that qualitative
interpretations have reliability and validity. This study demonstrated qualitative interpretation of a few
teacher and student maps without providing such evidence. A logical approach to validating a qualitative
map interpretation would be to directly interview a student before [...]
evaluation of the maps by the same respondent. The interviews should be open-ended at first; then
students could react to their MDS maps.

A second qualitative validation approach might include student selection or free-hand construction of
spatial maps. Both approaches could help establish whether the MDS methodology
biases interpretations of cognitive structures. Information from these approaches might also generate new
approaches to MDS map interpretation.

Three types of map interpretation were considered, based on cluster membership, relationships among
individual concepts, and hierarchical arrangement of concepts. It is not known which type of interpretation
could be most readily understood and communicated by reading specialists and teachers. Neither is it
known if one method is better suited than another for different types of organization of expository text.
Other semantic structure models (e.g., Holley & Dansereau) provide alternative structures for text written
with different types of concept organization. Further research is needed on these questions.

Both interpretations based on cluster membership (whether on the map or in a hierarchical tree) rely on
secondary hierarchical cluster analysis. Cluster analysis has some notoriety for instability, and has been
classified as little more than a heuristic (Aldenderfer & Blashfield, 1984). Considerable agreement was
noted between cluster solutions based on Ward and Average linkage algorithms. Other algorithms did not
match well, however. The instability of cluster solutions and the complexity of the analysis need to be
weighed against the benefits. When cluster definitions are desired on the map (rather than tree diagrams),
human judgments may suffice. The ability of teachers to directly interpret map clusters would reduce the
time and technical skills required. Reliability studies are needed on this question.
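
Agreement between two cluster solutions, such as those from two linkage algorithms, can be quantified with the Rand statistic (Rand, 1971). The sketch below is one possible way to do so and is not presented as the procedure used in this study; the cluster labels are hypothetical, and the unadjusted statistic shown here does not correct for chance agreement.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Proportion of concept pairs that both solutions treat the same way
    (both place the pair in one cluster, or both separate the pair)."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


# Hypothetical cluster memberships for eight concepts under two algorithms.
solution_1 = [0, 0, 0, 1, 1, 1, 2, 2]
solution_2 = [0, 0, 0, 1, 1, 2, 2, 2]
print(round(rand_index(solution_1, solution_2), 3))
```

A value of 1.0 means the two solutions partition the concepts identically; lower values indicate that more pairs are grouped differently.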




The disagreements obtained among teacher raters raise the question of what constitutes an "expert".
Perhaps subject matter experts are required, rather than teachers who are more familiar with the textbooks
as teaching tools and with the information their students could reasonably gain from the texts. Content
knowledge also plays an unknown role in the interpretation of map [...] relationships. What level of
content knowledge is sufficient?

This study used only eight key vocabulary words per map, whereas most passages yielded at least
eight to twelve terms. Eight concepts is a marginal number for scaling in two dimensions; nine or ten would
be preferable. The biggest problem in increasing the number of concepts is the geometric increase in the
length of the concept comparison task (28 comparisons for 8 concepts, 36 comparisons for 9 concepts,
etc.). Incomplete block sampling schemes for reducing the number of necessary comparisons have been
researched in Monte Carlo studies (Davison, 1983). Their stability appears to depend heavily on the nature
and content of the comparison task. No research was found on incomplete block designs with small
numbers of concepts. That type of investigation is urgently needed to help determine the utility of MDS
mapping under less controlled text conditions.
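
The task lengths quoted above follow from the number of unordered pairs among n concepts, n(n - 1)/2. A minimal sketch, using the eight key words of "The Heart" passage as example input; the function names are assumptions introduced for illustration.

```python
from itertools import combinations


def n_comparisons(n_concepts):
    """Number of unordered concept pairs in a complete comparison task."""
    return n_concepts * (n_concepts - 1) // 2


def comparison_pairs(concepts):
    """List every unordered pair of concepts once."""
    return list(combinations(concepts, 2))


heart_concepts = ["atrium", "cardiac", "tissue", "chamber",
                  "valve", "contracts", "ventricle", "artery"]

for n in (8, 9, 10, 12):
    print(n, n_comparisons(n))
print(len(comparison_pairs(heart_concepts)))
```

Moving from eight to twelve concepts more than doubles the task length (28 versus 66 pairs), which is why incomplete designs become attractive.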

Despite the many unanswered questions, this study supports the further investigation of spatial maps
for assessing reading comprehension. With the technical underpinning of MDS, spatial maps can
potentially address several of the deficiencies attributed to most existing reading assessment techniques by
increasing numbers of professionals who have adopted a cognitive processing view of reading
comprehension. At this point, MDS for reading assessment is suitable mainly as a research tool, requiring
technological and statistical expertise. However, concept comparison tests can be efficiently produced and
group administered. This fact should encourage serious consideration of the technique for selected reading
assessment purposes if other studies further support its reliability, sensitivity, and validity.




References 

American Psychological Association. (1954). Technical recommendations for psychological tests and
diagnostic techniques. Washington, DC: APA.

American Psychological Association, AERA, NCME. (1985). Standards for educational and psychological
testing. Washington, DC: APA.

Aiken, L. R. (1979). Psychological testing and assessment. Boston, MA: Allyn & Bacon.

Aldenderfer, M. S., & Blashfield, R. K. (1984). Cluster analysis. Beverly Hills, CA: Sage Publications.

Allington, R. L. (1982). The persistence of teacher beliefs in facets of the visual perceptual deficit
hypothesis. Elementary School Journal, 82, 351-359.

Anastasi, A. (1986). Evolving concepts of test validation. Annual Review of Psychology, 37, 1-15.

Anastasi, A. (1976). Psychological testing (3rd ed.). New York: Macmillan.

Anderson, J. R., & Bower, G. H. (1973). Human associative memory. New York: Winston.

Anderson, R. C. (1977). Schema-directed processes in language comprehension (Tech. Rep. No. 50).
Urbana: U. of Illinois, Center for the Study of Reading, July 1977.

Applied Psychological Measurement, Vol. 7, No. 4. (1983). M. L. Davison & L. E. Jones (Eds.).

Armbruster, B. B., & Anderson, T. H. (1984). Mapping: Representing informative text diagrammatically. In
C. Holley & D. Dansereau (Eds.), Spatial learning strategies: Techniques, applications, and related
issues. New York: Academic Press, Inc.

Armbruster, B. B., & Anderson, T. H. (1980). The effect of mapping on the free recall of expository text (Tech.
Rep. No. 160). Urbana: Univ. of Illinois, Center for the Study of Reading.

Anastasi, A. (1988). Psychological testing (3rd ed.). New York: Macmillan Publishing Company.

Barufaldi, J. P., Ladd, G. T., & Moses, A. J. (Eds.). (1981). Heath science. Lexington, MA: D. C. Heath.

Bayne, R., Beauchamp, J., Begovich, C., & Kane, V. (1980). Monte Carlo comparisons of selected
clustering procedures. Pattern Recognition, 12, 51-62.

Beaugrande, R. (1980). Text, discourse, and process. Norwood, NJ: Ablex.

Bisanz, G., LaPorte, R., Vesonder, G., & Voss, J. F. (1978). On the representation of prose in memory: A
multidimensional approach. Journal of Verbal Learning and Verbal Behavior.

Blashfield, R. K. (1980). The growth of cluster analysis: Tryon, Ward, and Johnson. Multivariate Behavioral
Research, 15, 439-458.

Bloom, B. S. (1976). Human characteristics and school learning. New York: McGraw-Hill.

Brennan, R. L. (1983). Elements of generalizability theory. Iowa City: ACT Publications.

Brennan, R. L., & Prediger, D. J. (1981). Coefficient kappa: Some uses, misuses, and alternatives.
Educational and Psychological Measurement, 41, 687-699.




Brewer, W. F. (1987). Schemas versus mental models in human memory (pp. 187-197). In P. Morris (Ed.),
Modelling cognition. New York: John Wiley and Sons.

Bridge, C., & Tierney, R. (1981). The inferential operations of children across text with narrative and
expository tendencies. Journal of Reading Behavior, 13(3), 201-214.

Brown, F. G. (1976). Principles of educational and psychological testing (2nd ed.). NY: Holt, Rinehart &
Winston.

Brown, L. T., & Stanners, R. F. (1984). The assessment and modification of concept interrelationships.
Journal of Experimental Education.

Calfee, R., & Drum, P. (1986). Research on teaching reading. In M. C. Wittrock (Ed.), Handbook of
research on teaching (3rd ed.). NY: Macmillan Publishing Co.

Carroll, J., & Arabie, P. (1980). Multidimensional scaling. Annual Review of Psychology, 31, 607-649.

Cohen, J. (1968). Weighted kappa: Nominal scale agreement with provision for scaled disagreement or
partial credit. Psychological Bulletin, 70(4), 213-220.

Cohen, H., & Davison, M. (1973). Jiffy-scale: A FORTRAN IV program for generating Ross-ordered pair
comparisons. Behavioral Science, 18, 76.

Cormack, R. (1971). A review of classification. Journal of the Royal Statistical Society (Series A), 134,
321-367.

Coxon, A. (1982). The user's guide to multidimensional scaling. Exeter, NH: Heinemann Educational Books.

Cronbach, L. J., Gleser, G. C., Nanda, H., & Rajaratnam, N. (1972). The dependability of behavioral
measurements. NY: Wiley.

Curtis, M. E., & Glaser, R. (1983). Reading theory and the assessment of reading achievement. Journal of
Educational Measurement, 20(2), 133-147.

Dansereau, D. F., McDonald, B. A., Collins, K. W., Garland, J. C., Holley, C. D., Diekhoff, G. M., & Evans,
S. H. (1979). Evaluation of a learning strategy system. In H. F. O'Neil, Jr., & C. D. Spielberger (Eds.),
Cognitive and affective learning strategies. New York: Academic Press.

Darlington, R. B., & Carlson, P. M. (1987). Behavioral statistics: Logic and methods. NY: The Free Press.

Davison, M. L. (1983). Multidimensional scaling. New York: John Wiley & Sons.

Davison, M., Richards, P., & Rounds, J. (1986). Multidimensional scaling in counseling research and
practice. Journal of Counseling and Development, 65, 178-184.

de Beaugrande, R. (1980). Text, discourse, and process. Norwood, NJ: Ablex.

de Leeuw, J., & Stoop, I. (1984). Upper bounds for Kruskal's stress. Psychometrika, 49, 391-402.

Diekhoff, G. M. (1982). Cognitive maps as a way of presenting the dimensions of comparison within the
history of psychology. Teaching of Psychology, 9, 115-116.

Diekhoff, G. M. (1983). Testing through relationship judgments. Journal of Educational Psychology, 75(2),
227-233.




Everitt, B. S. (1988). Cluster analysis. In J. P. Keeves (Ed.), Educational research, methodology, and
measurement: An international handbook (pp. 247-253). New York: Pergamon Press.

Fenker, R. M. (1975). The organization of conceptual materials: A methodology for measuring ideal and
actual cognitive structures. Instructional Science, 4, 33-57.

Fisher, L., & Van Ness, J. W. (1971). Admissible clustering procedures. Biometrika, 58, 91-104.

Fitzpatrick, A. R. (1983). The meaning of content validity. Applied Psychological Measurement, 7(1), 3-13.

Fleiss, J. L. (1981). Statistical methods for rates and proportions. New York: John Wiley & Sons.

Frederiksen, C. H. (1975). Acquisition of semantic information from discourse: Effects of repeated
exposures. Journal of Verbal Learning and Verbal Behavior, 14, 158-169.

Frederiksen, C. H. (1977). Semantic processing units in understanding text. In R. O. Freedle (Ed.),
Discourse production and comprehension. Norwood, NJ: Ablex.

Frederiksen, C. H. (1979). Discourse comprehension and early reading. In L. B. Resnick & P. A. Weaver
(Eds.), Theory and practice of early reading (Vol. 1). Hillsdale, NJ: Erlbaum.

Freedle, R. O. (1979). Advances in discourse processes, Vol. 2: New directions in discourse processing.
Norwood, NJ: Ablex.

Geva, E. (1980). Meta textual notions and reading comprehension. Unpublished doctoral dissertation, U. of
Toronto.

Geva, E. (1983). Facilitating reading comprehension through flowcharting. Reading Research Quarterly,
18(4), 385-406.

Glaser, R. (1981). The future of testing: A research agenda for cognitive psychology and psychometrics.
American Psychologist, 36, 923-936.

Graef, J., & Spence, I. (1979). Using distance information in the design of large multidimensional scaling
experiments. Psychological Bulletin, 86, 60-66.

Griffeth, R., Hom, P., DeNisi, A., & Kirchner, W. (1985). A comparison of different methods of clustering
countries on the basis of employee attitudes. Human Relations, 38, 813-840.

Guion, R. M. (1978). Scoring of content domain samples. Journal of Applied Psychology, 63, 499-506.

Hagus. G. P. Requ«. B. R.. & Wilson. R. H. (Eds.) (1985). Heath social studies Lexington. MA: D. C. 
Heath. 

Hambleton, R. K. (1980). Test score validity and standard-setting methods. In R. A. Berk (Ed.), Criterion-
referenced measurement: The state of the art. Baltimore: Johns Hopkins University Press.

Hauf, M. B. (1971). Mapping: A technique for translating reading into thinking. Journal of Reading, 14,
225-230.

Hays, W. L. (1981). Statistics (3rd ed.). New York: Holt, Rinehart & Winston.

Heimlich, J. E., & Pittelman, S. D. (1986). Semantic applications: Classroom applications. Newark,
Delaware: International Reading Association.




Holley, C. D., & Dansereau, D. F. (1984). The development of spatial learning strategies. In C. Holley and D.
Dansereau (Eds.), Spatial learning strategies: Techniques, applications, and related issues. New York:
Academic Press.

Howell, K. W., & Kaplan, J. S. (1980). Diagnosing basic skills. Columbus, OH: Charles E. Merrill.

Johnson, S. (1967). Hierarchical clustering schemes. Psychometrika, 32, 241-254.

Johnson-Laird, P. N. (1980). Mental models in cognitive science. Cognitive Science, 4, 71-115.

Johnson-Laird, P. N. (1983). Mental models. Cambridge, Mass.: University Press.

Johnson-Laird, P. N., & Wason, P. C. (1977). Thinking: Readings in cognitive science. New York:
Cambridge University Press.

Johnston, P. H. (1984). Assessment in reading. In P. D. Pearson (Ed.), Handbook of reading research.
New York: Longman.

Kavale, K. (1981). Functions of the Illinois Test of Psycholinguistic Abilities (ITPA): Are they trainable?
Exceptional Children, 47, 496-513.

Kintsch, W., & van Dijk, T. A. (1978). Toward a model of text comprehension and production.
Psychological Review, 85, 363-394.

Kirsch, I. S., & Guthrie, J. T. (1980). Construct validity of functional reading tests. Journal of Educational
Measurement, 17(2), 81-93.

Kruskal, J. B., & Wish, M. (1978). Multidimensional scaling. Beverly Hills, CA: Sage Publications.

LaPorte, R. E., & Voss, J. F. (1979). Prose representation: A multidimensional scaling approach. Multivariate
Behavioral Research, 14, 39-56.

Levy, P. (1987). Modelling cognition: Some current issues (pp. 3-20). In P. Morris (Ed.), Modelling cognition.
New York: John Wiley and Sons.

Mandler, J. M., & Johnson, N. S. (1977). Remembrance of things parsed: Story structure and recall.
Cognitive Psychology, 9, 111-151.

Messick, S. (1980). Test validity and ethics of assessment. American Psychologist, 35, 1019-1027.

Messick, S. (1981). Evidence and ethics in the evaluation of tests. Educational Researcher, 10, 9-20.

Messick, S. (1989). Validity (pp. 13-105). In R. L. Linn (Ed.), Educational measurement (3rd ed.). New
York: American Council on Education & Macmillan Publishing Company.

Meyer, B. J. F. (1975). The organization of prose and its effects on memory. Amsterdam: North-Holland
Publishing Co.

Meyer, B. J. F., & Rice, G. E. (1984). The structure of text (Chapt. 11). In P. D. Pearson, R. Barr, M. L. Kamil, &
P. Mosenthal (Eds.), Handbook of reading research. NY: Longman.

Meyer, B. J. F., Brandt, D. M., & Bluth, G. J. (1980). Use of top-level structure in text: Key for reading
comprehension in ninth-grade students. Reading Research Quarterly, 16, 72-103.




Milligan, G. (1980). An examination of the effect of six types of error perturbation on fifteen clustering
algorithms. Psychometrika, 45, 325-342.

Milligan, G. (1981). A review of Monte Carlo tests of cluster analysis. Multivariate Behavioral Research, 16.

Millward, R. B. (1985). Mind your mental models. Journal of Psycholinguistic Research, 14(5), 427.

Mojena, R. (1977). Hierarchical grouping methods and stopping rules: An evaluation. Computer Journal,
20, 359-363.

Morey, L. C., & Agresti, A. (1984). The measurement of classification agreement: An adjustment to the
Rand statistic for chance agreement. Educational and Psychological Measurement, 44, 33-37.

Niles, O. S. (1965). Organization perceived. In H. H. Herber (Ed.), Perspectives in reading: Developing
study skills in secondary schools. Newark, Del.: International Reading Association.

Nunnally, J. C. (1978). Psychometric theory. NY: McGraw-Hill Book Company.

Preece. (1976). Mapping cognitive structure: A comparison of methods. Journal of Educational Psychology,
68(1), 1-8.

Psychometrika, Vol. 51, No. 1. (1986).

Ramsey, W. L., Gabriel, L. A., McGuirk, J. R., Phillips, C. R., & Watenpaugh, F. M. (Eds.). (1985). Holt
general science. New York: Holt, Rinehart & Winston.

Ramsey, W. L., Gabriel, L. A., McGuirk, J. R., Phillips, C. R., & Watenpaugh, F. M. (Eds.). (1985). Holt life
science. New York: Holt, Rinehart & Winston.

Rand, W. M. (1971). Objective criteria for the evaluation of clustering methods. Journal of the American
Statistical Association, 66, 846-850.

Raphael, T. E., Englert, C. S., & Kirschner, B. W. (1986). The impact of text structure instruction and social
context on students' comprehension and production of expository text (Research Series No. 177). East
Lansing, Mich.: The Institute for Research on Teaching, Michigan State Univ.

Reutzel, D. R. (1986). Investigating a synthesized comprehension instructional strategy: The cloze story
map. Journal of Educational Research, 79(6), 343-349.

Richardson, J. T. E. (1983). Mental imagery in thinking and problem solving (pp. 197-226). In J. Evans (Ed.),
Thinking and reasoning: Psychological approaches. Boston: Routledge & Kegan Paul.

Rumelhart, D. E., & Ortony, A. (1977). The representation of knowledge in memory. In R. C. Anderson, R. J.
Spiro, & W. E. Montague (Eds.), pp. 99-135.

Rumelhart, D. E. (1975). Notes on a schema for stories. In D. G. Bobrow & A. M. Collins (Eds.),
Representation and understanding. New York: Academic Press.

Rumelhart, D. E., Lindsay, P. H., & Norman, D. A. (1972). A process model for long-term memory. In E. Tulving
& W. Donaldson (Eds.), Organization of memory. New York: Academic Press.

Schank, R. C., & Abelson, R. P. (1977). Scripts, plans, goals, and understanding. Hillsdale, NJ: Erlbaum.




Schiffman, S., Reynolds, M., & Young, F. (1981). Introduction to multidimensional scaling: Theory, methods,
and applications. San Francisco, CA: Academic Press, Inc.

Schwartz, R. M. (1984). Measuring reading competence: A theoretical-prescriptive approach. NY: Plenum
Press.

Shavelson, R. J. (1974). Methods for examining representations of a subject matter structure in a student's
memory. Journal of Research in Science Teaching, 11, 231-249.

Shepard, R. N., Kilpatrick, D. W., & Cunningham, J. P. (1975). The internal representation of numbers.
Cognitive Psychology, 7, 82-138.

Sinatra, R. C., Stahl-Gemake, J., & Morgan, N. W. (1986). Using semantic mapping after reading to organize
and write original discourse. Journal of Reading, 30(1), 2-13.

Sneath, P., & Sokal, R. (1973). Numerical taxonomy. San Francisco, CA: Freeman.

Spence, I., & Domoney, D. (1974). Single subject incomplete designs for nonmetric multidimensional
scaling. Psychometrika, 39, 469-490.

Spence, I. (1982). Incomplete experimental designs for multidimensional scaling. In R. G. Golledge & J. N.
Rayner (Eds.), Proximity and preference: Problems in the multidimensional analysis of large data sets.
Minneapolis: University of Minnesota Press.

Spence, I. (1983). Monte Carlo simulation studies. Applied Psychological Measurement, 7, 405-425.

Spiro, R. J. (1977). Remembering information from text: The state of the schema approach. In R. C.
Anderson & W. E. Montague (Eds.), Schooling and the acquisition of knowledge. Hillsdale, NJ:
Erlbaum.

Stanners, R. F., Brown, L. T., Price, J. M., & Holmes, M. (1983). Concept comparisons, essay examinations,
and conceptual knowledge. Journal of Educational Psychology, 75(6), 857-864.

Stanners, R. F., Price, J. M., & Painton, S. (1982). Interrelationships among text elements in fictional prose.
Applied Psycholinguistics, 2, 95-107.

Stasz, C., Shavelson, R. J., Cox, D. L., & Moore, C. A. (1976). Field independence and the structuring of
knowledge in a social studies minicourse. Journal of Educational Psychology, 68, 550-558.

Stein, N. L., & Glenn, C. G. (1979). An analysis of story comprehension in elementary school children. In R.
O. Freedle (Ed.), New directions in discourse processing. Norwood, NJ: Ablex.

Sternberg, R. J. (1981). Testing and cognitive psychology. American Psychologist, 36, 1181-1189.

Surber, J. R., & Smith, P. L. (1981). Testing for misunderstanding. Educational Psychologist, 16, 163-174.

Surber, J. R. (1984). Mapping as a testing and diagnostic device. In C. Holley & D. Dansereau (Eds.),
Spatial learning strategies: Techniques, applications, and related issues. New York: Academic Press,
Inc.

Thorndike, R. L., & Hagen, E. (1977). Measurement and evaluation in education and psychology (4th ed.).
NY: Wiley.






Thorndyke, P. W. (1977). Cognitive structures in comprehension and memory of narrative discourse.
Cognitive Psychology, 9, 77-110.

Trabasso, T. (1978). Cognitive prerequisites to reading. Paper presented at the meeting of the American
Educational Research Association, Toronto, March 1978.

Valencia, Pearson, & Chapman. (1986). New strategies for reading comprehension assessment: Illinois
initiatives. Ill.: Center for the Study of Reading.

van Dijk, T. A., & Kintsch, W. (1983). Strategies for discourse comprehension. New York: Academic Press.

Wagener, M., & Wender, K. F. (1985). Spatial representations and inference processes in memory for text
(pp. 115-136). In G. Rickheit & H. Strohner (Eds.), Inferences in text processing. North-Holland: Elsevier
Science Publishers B. V.

Ward, J. (1963). Hierarchical grouping to optimize an objective function. Journal of the American Statistical
Association, 58, 236-244.

Weiner, H., & Kaye, K. (1974). Multidimensional scaling of concept learning in an introductory course.
Journal of Educational Psychology, 66, 591-598.

Wilensky, R. (1978). Why John married Mary: Understanding stories involving recurring goals. Cognitive
Science, 2, 235-266.

Wilkinson, L. (1989). SYSTAT: The system for statistics. Evanston, IL: SYSTAT, Inc.

Winograd, P. N. (1984). Strategic difficulties in summarizing texts. Reading Research Quarterly, 19,
404-425.

Woodcock, R. W., & Johnson, M. B. (1977). Woodcock-Johnson Psycho-Educational Battery. Part Two:
Tests of Achievement. Allen, TX: DLM Teaching Resources.

Jastak & Wilkinson. (1984). The Wide Range Achievement Test-Revised. Wilmington, DE: Jastak
Associates, Inc.

Young, F. W., & Lewyckyj, R. (1979). ALSCAL-4 user's guide. University of North Carolina, Chapel Hill:
Data Analysis and Theory Associates.

Young, F. (1984). Scaling. Annual Review of Psychology, 35, 55-81.




Table 1

Agreement Among Five Raters on 28-Item Concept Comparison Tests Based on Eight Passages
From Science and Social Studies Basal Texts

                                                        Cohen's Kappa
                          Intraclass      ---------------------------------------------
Passage                   Correlation     Kappa / Max. Kappa = Ratio (1)    Weighted (2)

"The Heart"*              [illegible]     .49 / .70 = .71                   .60
"Igneous Rocks"*          .81             .51 / .65 = .78                   .60
"Population Limits"       .71             .27 / .79 = .35                   .43
"One-celled Animals"      .73             .40 / .75 = .54                   .49
"The Seashore"            .73             .28 / .59 = .48                   .40
"The Skeletal System"*    .87             .47 / .83 = .57                   .62
"Soviet Union"            .69             .28 / .71 = .39                   .39
"Texas"                   .65             .32 / .90 = .36                   .47

All coefficients are significant beyond the .01 level.
* Three most reliable passages selected for Phase II.

(1) The ratio of Kappa to the maximum possible Kappa value for the given table.

(2) Weighted Kappa: linear weights of 0, .25, .50, and 1 are assigned according to degree of discrepancy
between raters.
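
The ratio of Kappa to its maximum possible value can be illustrated for a single pair of raters. This is a minimal sketch under stated assumptions: the agreement table is hypothetical, the function name is introduced here for illustration, and the maximum is taken as the Kappa obtained when observed agreement is as high as the two raters' marginal totals allow. How the study pooled five raters is not shown here.

```python
def kappa_and_max(table):
    """Cohen's Kappa and its maximum given the marginals, for a square
    rater-by-rater agreement table of counts."""
    k = len(table)
    n = sum(sum(row) for row in table)
    row_p = [sum(table[i]) / n for i in range(k)]
    col_p = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    p_obs = sum(table[i][i] for i in range(k)) / n
    p_exp = sum(row_p[i] * col_p[i] for i in range(k))
    # Highest agreement attainable without changing either rater's marginals.
    p_max = sum(min(row_p[i], col_p[i]) for i in range(k))
    kappa = (p_obs - p_exp) / (1 - p_exp)
    kappa_max = (p_max - p_exp) / (1 - p_exp)
    return kappa, kappa_max


# Hypothetical counts: rows are rater A's categories, columns are rater B's.
counts = [[20, 5],
          [10, 15]]
kappa, kappa_max = kappa_and_max(counts)
print(round(kappa, 2), round(kappa_max, 2), round(kappa / kappa_max, 2))
```

Because unequal marginals cap the attainable Kappa below 1, the ratio shows how much of the achievable agreement was actually reached.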




Table 2

Medians and Interquartile Ranges for Normalized Percentiles in Reading Comprehension and Vocabulary
for 104 Jr. and Sr. High School Students Served by Chapter I Compensatory and Categorical Special
Education Programs.

                            Reading Comp.        Vocabulary
                            Md      IQR*         Md      IQR*

Jr. High (n=53)
   Special Ed. (n=20)       20      13           21      13
   Chapt. I (n=33)          23      14           19       9

Sr. High (n=51)
   Special Ed. (n=23)       29      20           24      16
   Chapt. I (n=28)          20      13           21      10

* IQR = Interquartile Range: spread of the middle half of scores clustered about the Median.






Table 3

Design Elements: Observations and Experimental Conditions Across Time by Group

                            O1            O2         O3      Reading:          O2          O4         O5
                            Extant        Pre-test           Related (Xr)      Post-test   Multiple
                            Ach. Scores   CC         Maze    Unrelated (Xu)    CC          Choice     O.R.F.

I. Treatment (n=54)
   Jr.: "Heart" (26)        O1            O2                                   O2
   Sr.: "Skeletal" (28)

II. Control (n=53)
   Jr.: "Heart" (28)        O1            O2                 Xu                O2
   Sr.: "Skeletal" (25)

I. Treatment (n=4[illegible])
   Jr. & Sr.: "Rocks"       O1            O2         O3R     Xr                O2          O4R        O5

II. Control (n=49)
   Jr. & Sr.: "Rocks"       O1            O2         O3U                       O2

Note:

C.C. = concept comparisons
O.R.F. = oral reading fluency




Table 4

Three-way ANOVA for Dependent Variable, "Concept Comparison Scores", with One Grouping Variable,
"Program", One Experimental Variable, "Reading Passage", and One Repeated Measure, "Time of
Assessment". (Jr. High Grade Level; "The Heart" Passage (N=53))

Source of Variance & (Levels)      SSbt      SSw       F(1,49)     p        η

Between Subject Effects:
   Read. (Related, Unrelated)      .264      2.51        5.16      .03      .31
   Prog. (SPED, Chapt. 1)          .55       2.51       10.84      .002     .43
   Read. x Prog.                   .01       2.51         .24      .63      .07

Within Subjects Effects:
   Time (Pre, Post)               1.55        .53      143.08      .000     .86
   Time x Read.                    .42        .53       39.14      .000     .67
   Time x Prog.                    .06        .53        5.65      .02      .32
   Time x Read. x Prog.            .003       .53         .27      .61      .07
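
The final column of the table above can be cross-checked against the F ratios. The sketch below assumes that column reports eta, computed as the square root of F x df1 / (F x df1 + df2); that reading is an inference from the tabled values rather than something the text states, and the function name is introduced here for illustration.

```python
import math


def eta_from_f(f_ratio, df_effect, df_error):
    """Effect size eta recovered from an F ratio and its degrees of freedom."""
    explained = f_ratio * df_effect
    return math.sqrt(explained / (explained + df_error))


# F ratios tabled for "The Heart" passage, each with 1 and 49 degrees of freedom.
for label, f_ratio in [("Read.", 5.16), ("Prog.", 10.84),
                       ("Time", 143.08), ("Time x Read.", 39.14)]:
    print(label, round(eta_from_f(f_ratio, 1, 49), 2))
```

Values recovered this way can be compared with the tabled entries as a check on both the column's meaning and the legibility of individual cells.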




Table 5

Three-way ANOVA for Dependent Variable, "Concept Comparison Scores", with One Grouping Variable,
"Program", One Experimental Variable, "Reading Passage", and One Repeated Measure, "Time of
Assessment". (Sr. High Grade Level; "The Skeletal System" Passage (N=50))

Source of Variance & (Levels)      SSbt      SSw       F(1,46)     p        η

Between Subject Effects:
   Read. (Related, Unrelated)     1.12       4.74       10.89      .002     .44
   Prog. (SPED, Chapt. 1)          .008      4.74         .08      .78      .04
   Read. x Prog.                   .04       4.74         .39      .53      .09

Within Subjects Effects:
   Time (Pre, Post)               2.34       1.23       86.93      .000     .81
   Time x Read.                    .839      1.23       31.14      .000     .64
   Time x Prog.                    .032      1.23        1.17      .28      .16
   Time x Read. x Prog.            .024      1.23         .88      .35      .14






Table 6

Three-way ANOVA for Dependent Variable, "Concept Comparison Scores", with One Grouping Variable,
"Program", One Experimental Variable, "Reading Passage", and One Repeated Measure, "Time of
Assessment". Sr. High ( ) and Jr. High ( ) Grade Levels, "Igneous Rocks" Passage

Source of Variance & (Levels)      SSbt      SSw       F(1,88)     p        η

Between Subject Effects:
   Read. (Related, Unrelated)      .707      5.16       12.07      .001     .35
   Prog. (SPED, Chapt. 1)          .000      5.16         .006     .94      0.0
   Read. x Prog.                   .102      5.16        1.74      .19      .14

Within Subjects Effects:
   Time (Pre, Post)               1.71        .927     162.39      .000     .81
   Time x Read.                    .537       .927      50.98      .000     .61
   Time x Prog.                    .068       .927       6.43      .01      .26
   Time x Read. x Prog.            .05        .927       4.74      .03      .23




Table 7

Pre- and Post-Reading Concept Comparison Scores (Pearson r's) with Reading of Related or
Unrelated Passage

"The Heart": Jr. High

             Special Ed. (n=20)           Chapt. I (n=33)              Total (n=53)
             Pre           Post           Pre           Post           Pre             Post
             M      SD     M      SD      M      SD     M      SD      M         SD    M      SD
Related     -.046  .151   .367   .153    .138   .167   .419   .160    [illeg.]  .182  .401   .157
Unrelated   -.029  .065   .127   .137    .176   .132   .243   .197    .095      .15   .196   .181

"The Skeletal and Muscular System": Sr. High

             Special Ed. (n=23)                Chapt. I (n=28)                    Total (n=51)
             Pre             Post              Pre             Post               Pre          Post
             M     SD        M      SD         M         SD    M         SD       M     SD     M      SD
Related     .068  [illeg.]  .434   .273       .088      .20   [illeg.]  .21      .078  .198   .474   .24
Unrelated   .076  .139      .165   .251       [illeg.]  .154  .199      .222     .056  .146   .184   .23

"Igneous Rocks": Jr. and Sr. High

             Special Ed. (n=31)                Chapt. I (n=60)                Total (n=91)
             Pre             Post              Pre             Post           Pre          Post
             M     SD        M      SD         M     SD        M      SD      M     SD     M      SD
Related     .082  [illeg.]  .428   [illeg.]   .107  .173      .332   .183    .1    .181   .357   .192
Unrelated   .049  .192      .143   .149       .106  [illeg.]  .188   .146    .083  .162   .169   .147
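
The scores tabled above are Pearson r's. A minimal sketch of how one such score could be computed, assuming a concept comparison score is the correlation between a student's relatedness ratings and a set of criterion ratings for the same concept pairs; the ratings shown are hypothetical, and the scoring details used in the study may differ.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical 1-4 relatedness ratings for the same four concept pairs.
criterion_ratings = [1, 2, 3, 4]
student_ratings = [1, 2, 4, 3]
print(round(pearson_r(student_ratings, criterion_ratings), 2))
```

On this reading, a score near zero indicates that a student's ratings bear little relation to the criterion ratings, and a higher score indicates closer correspondence.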




Table 8

Descriptive Data for Pre- and Posttest CC, Published Standardized Group Reading Tests, Maze Tests,
Multiple Choice Tests, and Oral Reading Fluency (n = 38).

Test                                  Min.      M       Max      SD

Pre-Reading CC (Pearson r)            -.23     .09      .24      .31
Post-Reading CC (Pearson r)            .11     .51      .79      .26
Std. Reading (percentile)             1.      20.      62.      14.8
Std. Vocabulary (percentile)          1.      19.      60.      13.8
Maze (percent correct)               17.      66.      92.      24.0
Multiple Choice (percent correct)    20.      53.      80.      18.0
Oral Reading Fluency (wcpm)          22.      91.     146.      29.7




Table 9

Correlation of Pre-Reading and Post-Reading Concept Comparisons with Five Criteria: the Maze, Multiple
Choice, Oral Reading Fluency, and Standardized Reading and Vocabulary Tests (N = 39)

               Pre C.C.   Post C.C.   Maze    M. Choice   Read. Std.   Vocab. Std.

Post C.C.      .42*
Maze           .28        .61*
M. Choice      .15        .45*        .75*
Read. Std.     .21        .36         .36     .38
Vocab. Std.    .19        .38         .43*    .38         .66*
O.R.F.         .15        .57*        .50*    .51*        .37          .45*

*p < .01






Figure Caption

Figure 1. Interpretation of Concept Clusters and Concept Relationships on an MDS Map.

[Figure: two panels plotting the key concepts from "The Heart" passage (contracts, tissue, cardiac, artery,
atrium, ventricle, chamber, valve) on a two-dimensional map. Panel 1, "Concept Cluster Interpretation",
labels clusters including "Composition & Basic Movement" and "Main Parts & Connector". Panel 2,
"Concept-Relationship Interpretation", labels links between concepts including "Class/Member",
"Members/Class", "Physical Connection", "Send/Receive", and "Movement/Modulation".]

Figure Caption

Figure 2. Concept Comparison Task for Multidimensional Scaling.

Student: ____________   Grade: ____   School: ____________   Teacher: ____________   Date: __/__/__

Passage: The Heart

atrium - cardiac
tissue - cardiac
tissue - chamber
valve - chamber
contracts - ventricle
valve - tissue
valve - cardiac
artery - chamber
ventricle - tissue
artery - contracts
ventricle - chamber
cardiac - chamber
ventricle - valve
contracts - tissue
atrium - valve
contracts - cardiac
atrium - ventricle
contracts - valve
atrium - tissue
atrium - contracts
artery - tissue
artery - cardiac
ventricle - cardiac
artery - valve
artery - ventricle
contracts - chamber
atrium - chamber
artery - atrium




Figure Caption

Figure 3. "The Heart" Science Text Passage with Underlined Key Vocabulary Words.

The Heart
(Heath Life Science, pp. 450-451)

Your heart is a cone-shaped organ that is found in the middle of your chest. The heart is about the size of a large fist.
You may think that pumping blood through the entire body is a big job [...]. But your heart is made of a
special tissue called cardiac muscle. This strong muscle contracts, pumping blood every second of the [...]
tired. In [...] your heart pumps between 60 and 80 times a minute every day. An adult heart [...]
each minute!

The heart is really two pumps that lie side by side. The right pump is separated from the left pump by a muscular wall.
There are four compartments or chambers in the heart. Each upper chamber is called an atrium. An atrium is a small,
thin-walled chamber that receives blood from the lungs or the body. Each lower chamber is called a ventricle. A ventricle
is a thick, muscular chamber that pumps blood to the lungs or the body.

There is a valve between each atrium and ventricle. The valve works like a one-way door. Blood can only flow from an
atrium to a ventricle. Blood in the ventricle can never flow back into the atrium because the valve closes as the blood
leaves.

Different kinds of special vessels carry blood through the body. One kind of vessel is called an artery. Arteries are
blood vessels that carry blood away from the heart. The walls of arteries are very elastic.

[255 words]






Figure Caption 

Figure 4. Hierarchical Cluster Analysis Solution for MDS Map Configuration.



"The Heart"

[Dendrogram. Horizontal axis: Joining Distances, 0.5 to 3.0.
Three-Cluster Solution: Composition & basic movement; Main parts & connectors; Separate body part.
Five-Cluster Solution: two cluster labels illegible; Main; Connector; Separate body part.]




Figure Caption 

Figure 5. ANOVA Interaction Plots for "Time" x "Read" for Junior High Students, Senior High Students, and
Combined Grade Levels.

[Interaction plots. Horizontal axis: Time (Pre-Reading, Post-Reading).]




Figure Caption 

Figure 6. Hierarchical Cluster Analysis of Correlation Matrix: Pre-Reading and Post-Reading Concept
Comparison Scores, the Maze, Multiple Choice, Oral Reading Fluency, and Published Reading and
Vocabulary Tests (N = 39).



[Dendrogram. Variables, top to bottom: Pre C.C., Read. Std., Vocab. Std., Mult. Choice, Maze, O.R.F., Post C.C.
Values printed on the plot: .58, .19, .64, .16, .33, .25.]






Appendix A 






Passages Which Served as Basis for Concept Comparison Tasks



The Skeletal and Muscular Systems 
(Holt General Science, pp. 525-527) 



Organs working together make up systems. Two of these systems are the skeletal system and the
muscular system.

The human skeleton is made up of bone and cartilage. One difference between the two is that
cartilage does not contain the calcium or phosphorus compounds that bone contains. This makes
cartilage more flexible than bone.

There are 206 bones in the human skeleton. Some of these bones are connected to each other by
ligaments. Since ligaments stretch easily, they allow the bones to move freely. This forms what is called
a movable joint.

Joints can allow movement in different directions. A hinge joint [...]. A
ball and socket joint allows rotational movement.

The inside surface of most joints is covered with cartilage. Joints also contain a special fluid that
lubricates them so they do not wear each other away.

Movement at the joints and other parts of the body is caused by the muscles. The muscles of the
arms and legs are examples of muscles that aid us in movement. These are called voluntary muscles.
There are some muscles like the ones found in the digestive, respiratory, and circulatory systems that
are involuntary.

All muscles work only by contracting. Since they only work by contracting, they can only pull. They
cannot push. If one set of muscles pulls on a tendon to bend a joint, another set of muscles must pull on
a different tendon to straighten the same joint.

[243 words] 



Igneous Rocks 
(Holt Science, pp. 82-83) 



Heat deep inside the earth causes some rocks to melt. Red-hot, melted rock under the earth's
surface is called magma. Sometimes, the magma pushes out through a crack or a weak spot in the
earth's crust. Red-hot melted rock coming out of the earth is called lava. The lava piles up, cools,
hardens, and forms a mountain of solid rock. This kind of mountain is called a volcano.

Rocks that form from melted material that cools and hardens are called igneous rocks. The word
igneous means 'coming from fire'. Hardened lava is one kind of igneous rock. The way the rock looks
depends on how fast the lava cooled.

The lava cools slowly as a volcano becomes inactive. Rocks formed by the slow cooling of melted
material have large crystals. Crystals are the structures that minerals form when they are solid. Gabbro
is an igneous rock that has large crystals of many minerals.

In active volcanos, the lava is mixed with hot gases. The lava explodes, or erupts, through a small
hole in the earth's surface. When this happens, the hot material often cools quickly. There is no time for
crystals to form. The lava hardens and looks like a glass rock. This kind of rock is called obsidian.

[...] lava cools so fast that the hot gases mixed with the lava do not have time to escape. They
become trapped inside the hardened lava and form a spongy rock light in color. This kind of igneous
rock is called pumice.

[252 words]






Appendix B 






Average Expert Teacher Maps 

"The Heart"
Three-Cluster Solution (Stress = .031)
[MDS map; axis ticks run from -2 to 2. Concept labels legible in the plot: contracts, cardiac, atrium, tissue, ventricle, artery, valve.]

"Igneous Rocks"
Four-Cluster Solution (Stress = .078)
[MDS map. Concept labels legible in the plot: obsidian, pumice, gabbro, crystals, lava, volcano, magma, crust.]

"The Skeletal and Muscular Systems"
Three-Cluster Solution (Stress = .007)
[MDS map; axis ticks run from -2 to 2. Concept labels legible in the plot: circulatory, tendon, involuntary, lubricates, digestive, calcium.]




Appendix C 

Pre-Reading and Post-Reading Maps for Alice and Bob






"The Heart" - Alice - Pretest
r = .12, T-b = .08, Q = [...] (Stress = .0095)

"The Heart" - Alice - Posttest
r = .46, T-b = .36, Q = .61 (Stress = .046)

[MDS maps; axis ticks run from -2 to 2. Concept labels legible in the plots: tissue, contracts, chamber, atrium.]



"The Heart" - Bob - Pretest
r = -.09, T-b = .12, Q = .33 (Stress = .064)

[MDS map. Concept labels legible in the plot: valve, artery, chamber, tissue, contracts.]

"The Heart" - Bob - Posttest
r = .39, T-b = .40, Q = .74 (Stress = .079)



